from:"Pengcheng Wang via llvm\-branch\-commits"

[llvm-branch-commits] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-03 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/94313

According to RVV spec:
> In general, the requirement is to support LMUL ≥ SEWMIN/ELEN,
> where SEWMIN is the narrowest supported SEW value and ELEN is
> the widest supported SEW value.
>
> For a given supported fractional LMUL setting, implementations
> must support SEW settings between SEWMIN and LMUL * ELEN, inclusive.

We print a warning if these requirements are not met.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-04 Thread Pengcheng Wang via llvm-branch-commits



@@ -71,18 +73,21 @@ vsetvli a2, a0, e32, m8, ta, ma
 
 vsetvli a2, a0, e32, mf2, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf2, ta, ma
+# CHECK-WARNING: :[[#@LINE-2]]:17: warning: SEW > 16 may not be compatible 
with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x75,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d757657 
 
 vsetvli a2, a0, e32, mf4, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf4, ta, ma
+# CHECK-WARNING: :[[#@LINE-2]]:17: warning: SEW > 8 may not be compatible with 
all RVV implementations{{$}}

wangpc-pp wrote:

I suppose we should.
One example I just met days ago:
```c
int main() {
  asm("vsetivli zero, 1, e32, mf4, ta, ma\n"
  "csrr a0, vtype\n"
  "csrr a1, vlenb\n"
  "vmv.s.x  v24, a6");
  return 0;
}
```
For MF4, the must supported SEWs are [8, 16], while the SEW is 32 here.
For C908 core on K230 board, this code will crash with an `Illegal Instruction` 
error, but not on `qemu`.
The `vlen` of C908 is 128 bits,  theoretically, for MF4, we can store one 
element in a vector register. But the implementation sets `vill` for this 
configuration (this is not against the RVV spec, though).
So I think we should warn as it can be incompatible.

https://github.com/llvm/llvm-project/pull/94313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-05 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/94313


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-05 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/94313


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-05 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

We may need it in binutils too. cc @kito-cheng

https://github.com/llvm/llvm-project/pull/94313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-06 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/94313

>From 6e3d6329300e27a23481df3e6e01b9763a34d9d2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Thu, 6 Jun 2024 15:05:20 +0800
Subject: [PATCH] Address comments

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/AsmParser/RISCVAsmParser.cpp   | 17 +++--
 llvm/test/MC/RISCV/rvv/vsetvl.s |  6 +++---
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp 
b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 4091f9d5e9f8a..49f21e46f34ea 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -2160,10 +2160,9 @@ bool RISCVAsmParser::parseVTypeToken(const AsmToken 
&Tok, VTypeState &State,
   unsigned ELEN = STI->hasFeature(RISCV::FeatureStdExtZve64x) ? 64 : 32;
   unsigned MinLMUL = ELEN / 8;
   if (Lmul > MinLMUL)
-Warning(
-Tok.getLoc(),
-Twine("The use of vtype encodings with LMUL < SEWMIN/ELEN == mf") +
-Twine(MinLMUL) + Twine(" is reserved"));
+Warning(Tok.getLoc(),
+"use of vtype encodings with LMUL < SEWMIN/ELEN == mf" +
+Twine(MinLMUL) + " is reserved");
 }
 
 State = VTypeState_TailPolicy;
@@ -2228,12 +2227,10 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
   unsigned MaxSEW = ELEN / Lmul;
   // If MaxSEW < 8, we should have printed warning about reserved LMUL.
   if (MaxSEW >= 8 && Sew > MaxSEW)
-Warning(
-SEWLoc,
-Twine("The use of vtype encodings with SEW > ") + Twine(MaxSEW) +
-Twine(" and LMUL == ") + Twine(Fractional ? "mf" : "m") +
-Twine(Lmul) +
-Twine(" may not be compatible with all RVV implementations"));
+Warning(SEWLoc,
+"use of vtype encodings with SEW > " + Twine(MaxSEW) +
+" and LMUL == " + (Fractional ? "mf" : "m") + Twine(Lmul) +
+" may not be compatible with all RVV implementations");
 }
 
 unsigned VTypeI =
diff --git a/llvm/test/MC/RISCV/rvv/vsetvl.s b/llvm/test/MC/RISCV/rvv/vsetvl.s
index 207daf392bd50..2741def0eeff2 100644
--- a/llvm/test/MC/RISCV/rvv/vsetvl.s
+++ b/llvm/test/MC/RISCV/rvv/vsetvl.s
@@ -73,21 +73,21 @@ vsetvli a2, a0, e32, m8, ta, ma
 
 vsetvli a2, a0, e32, mf2, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf2, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 
16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x75,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d757657 
 
 vsetvli a2, a0, e32, mf4, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf4, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 8 and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 8 
and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x65,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d657657 
 
 vsetvli a2, a0, e32, mf8, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf8, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: The use of vtype encodings with 
LMUL < SEWMIN/ELEN == mf4 is reserved{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: use of vtype encodings with LMUL < 
SEWMIN/ELEN == mf4 is reserved{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x55,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d557657 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-06 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/94313

>From 6e3d6329300e27a23481df3e6e01b9763a34d9d2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Thu, 6 Jun 2024 15:05:20 +0800
Subject: [PATCH 1/2] Address comments

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/AsmParser/RISCVAsmParser.cpp   | 17 +++--
 llvm/test/MC/RISCV/rvv/vsetvl.s |  6 +++---
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp 
b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 4091f9d5e9f8a..49f21e46f34ea 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -2160,10 +2160,9 @@ bool RISCVAsmParser::parseVTypeToken(const AsmToken 
&Tok, VTypeState &State,
   unsigned ELEN = STI->hasFeature(RISCV::FeatureStdExtZve64x) ? 64 : 32;
   unsigned MinLMUL = ELEN / 8;
   if (Lmul > MinLMUL)
-Warning(
-Tok.getLoc(),
-Twine("The use of vtype encodings with LMUL < SEWMIN/ELEN == mf") +
-Twine(MinLMUL) + Twine(" is reserved"));
+Warning(Tok.getLoc(),
+"use of vtype encodings with LMUL < SEWMIN/ELEN == mf" +
+Twine(MinLMUL) + " is reserved");
 }
 
 State = VTypeState_TailPolicy;
@@ -2228,12 +2227,10 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
   unsigned MaxSEW = ELEN / Lmul;
   // If MaxSEW < 8, we should have printed warning about reserved LMUL.
   if (MaxSEW >= 8 && Sew > MaxSEW)
-Warning(
-SEWLoc,
-Twine("The use of vtype encodings with SEW > ") + Twine(MaxSEW) +
-Twine(" and LMUL == ") + Twine(Fractional ? "mf" : "m") +
-Twine(Lmul) +
-Twine(" may not be compatible with all RVV implementations"));
+Warning(SEWLoc,
+"use of vtype encodings with SEW > " + Twine(MaxSEW) +
+" and LMUL == " + (Fractional ? "mf" : "m") + Twine(Lmul) +
+" may not be compatible with all RVV implementations");
 }
 
 unsigned VTypeI =
diff --git a/llvm/test/MC/RISCV/rvv/vsetvl.s b/llvm/test/MC/RISCV/rvv/vsetvl.s
index 207daf392bd50..2741def0eeff2 100644
--- a/llvm/test/MC/RISCV/rvv/vsetvl.s
+++ b/llvm/test/MC/RISCV/rvv/vsetvl.s
@@ -73,21 +73,21 @@ vsetvli a2, a0, e32, m8, ta, ma
 
 vsetvli a2, a0, e32, mf2, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf2, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 
16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x75,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d757657 
 
 vsetvli a2, a0, e32, mf4, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf4, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 8 and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 8 
and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x65,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d657657 
 
 vsetvli a2, a0, e32, mf8, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf8, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: The use of vtype encodings with 
LMUL < SEWMIN/ELEN == mf4 is reserved{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: use of vtype encodings with LMUL < 
SEWMIN/ELEN == mf4 is reserved{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x55,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d557657 

>From 6b44742cfcc24a07408bbe20070f57ebaa4e9066 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 7 Jun 2024 11:27:13 +0800
Subject: [PATCH 2/2] Remove Fractional

Created using spr 1.3.6-beta.1
---
 llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp 
b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 49f21e46f34ea..ca11d155ec7c6 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -2229,7 +2229,7 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
   if (MaxSEW >= 8 && Sew > MaxSEW)

[llvm-branch-commits] [llvm] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-06 Thread Pengcheng Wang via llvm-branch-commits



@@ -2211,6 +,16 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
 
   if (getLexer().is(AsmToken::EndOfStatement) && State == VTypeState_Done) {
 RISCVII::VLMUL VLMUL = RISCVVType::encodeLMUL(Lmul, Fractional);
+if (Fractional) {
+  unsigned ELEN = STI->hasFeature(RISCV::FeatureStdExtZve64x) ? 64 : 32;
+  unsigned MaxSEW = ELEN / Lmul;
+  // If MaxSEW < 8, we should have printed warning about reserved LMUL.
+  if (MaxSEW >= 8 && Sew > MaxSEW)
+Warning(SEWLoc,
+"use of vtype encodings with SEW > " + Twine(MaxSEW) +
+" and LMUL == " + (Fractional ? "mf" : "m") + Twine(Lmul) +

wangpc-pp wrote:

Good catch!

https://github.com/llvm/llvm-project/pull/94313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-10 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/94313

>From 6e3d6329300e27a23481df3e6e01b9763a34d9d2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Thu, 6 Jun 2024 15:05:20 +0800
Subject: [PATCH 1/2] Address comments

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/AsmParser/RISCVAsmParser.cpp   | 17 +++--
 llvm/test/MC/RISCV/rvv/vsetvl.s |  6 +++---
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp 
b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 4091f9d5e9f8a..49f21e46f34ea 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -2160,10 +2160,9 @@ bool RISCVAsmParser::parseVTypeToken(const AsmToken 
&Tok, VTypeState &State,
   unsigned ELEN = STI->hasFeature(RISCV::FeatureStdExtZve64x) ? 64 : 32;
   unsigned MinLMUL = ELEN / 8;
   if (Lmul > MinLMUL)
-Warning(
-Tok.getLoc(),
-Twine("The use of vtype encodings with LMUL < SEWMIN/ELEN == mf") +
-Twine(MinLMUL) + Twine(" is reserved"));
+Warning(Tok.getLoc(),
+"use of vtype encodings with LMUL < SEWMIN/ELEN == mf" +
+Twine(MinLMUL) + " is reserved");
 }
 
 State = VTypeState_TailPolicy;
@@ -2228,12 +2227,10 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
   unsigned MaxSEW = ELEN / Lmul;
   // If MaxSEW < 8, we should have printed warning about reserved LMUL.
   if (MaxSEW >= 8 && Sew > MaxSEW)
-Warning(
-SEWLoc,
-Twine("The use of vtype encodings with SEW > ") + Twine(MaxSEW) +
-Twine(" and LMUL == ") + Twine(Fractional ? "mf" : "m") +
-Twine(Lmul) +
-Twine(" may not be compatible with all RVV implementations"));
+Warning(SEWLoc,
+"use of vtype encodings with SEW > " + Twine(MaxSEW) +
+" and LMUL == " + (Fractional ? "mf" : "m") + Twine(Lmul) +
+" may not be compatible with all RVV implementations");
 }
 
 unsigned VTypeI =
diff --git a/llvm/test/MC/RISCV/rvv/vsetvl.s b/llvm/test/MC/RISCV/rvv/vsetvl.s
index 207daf392bd50..2741def0eeff2 100644
--- a/llvm/test/MC/RISCV/rvv/vsetvl.s
+++ b/llvm/test/MC/RISCV/rvv/vsetvl.s
@@ -73,21 +73,21 @@ vsetvli a2, a0, e32, m8, ta, ma
 
 vsetvli a2, a0, e32, mf2, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf2, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 
16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x75,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d757657 
 
 vsetvli a2, a0, e32, mf4, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf4, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 8 and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 8 
and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x65,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d657657 
 
 vsetvli a2, a0, e32, mf8, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf8, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: The use of vtype encodings with 
LMUL < SEWMIN/ELEN == mf4 is reserved{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: use of vtype encodings with LMUL < 
SEWMIN/ELEN == mf4 is reserved{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x55,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d557657 

>From 6b44742cfcc24a07408bbe20070f57ebaa4e9066 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 7 Jun 2024 11:27:13 +0800
Subject: [PATCH 2/2] Remove Fractional

Created using spr 1.3.6-beta.1
---
 llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp 
b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 49f21e46f34ea..ca11d155ec7c6 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -2229,7 +2229,7 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
   if (MaxSEW >= 8 && Sew > MaxSEW)

[llvm-branch-commits] [llvm] [RISCV][MC] Warn if SEW/LMUL may not be compatible (PR #94313)

2024-06-10 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/94313

>From 6e3d6329300e27a23481df3e6e01b9763a34d9d2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Thu, 6 Jun 2024 15:05:20 +0800
Subject: [PATCH 1/2] Address comments

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/AsmParser/RISCVAsmParser.cpp   | 17 +++--
 llvm/test/MC/RISCV/rvv/vsetvl.s |  6 +++---
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp 
b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 4091f9d5e9f8a..49f21e46f34ea 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -2160,10 +2160,9 @@ bool RISCVAsmParser::parseVTypeToken(const AsmToken 
&Tok, VTypeState &State,
   unsigned ELEN = STI->hasFeature(RISCV::FeatureStdExtZve64x) ? 64 : 32;
   unsigned MinLMUL = ELEN / 8;
   if (Lmul > MinLMUL)
-Warning(
-Tok.getLoc(),
-Twine("The use of vtype encodings with LMUL < SEWMIN/ELEN == mf") +
-Twine(MinLMUL) + Twine(" is reserved"));
+Warning(Tok.getLoc(),
+"use of vtype encodings with LMUL < SEWMIN/ELEN == mf" +
+Twine(MinLMUL) + " is reserved");
 }
 
 State = VTypeState_TailPolicy;
@@ -2228,12 +2227,10 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
   unsigned MaxSEW = ELEN / Lmul;
   // If MaxSEW < 8, we should have printed warning about reserved LMUL.
   if (MaxSEW >= 8 && Sew > MaxSEW)
-Warning(
-SEWLoc,
-Twine("The use of vtype encodings with SEW > ") + Twine(MaxSEW) +
-Twine(" and LMUL == ") + Twine(Fractional ? "mf" : "m") +
-Twine(Lmul) +
-Twine(" may not be compatible with all RVV implementations"));
+Warning(SEWLoc,
+"use of vtype encodings with SEW > " + Twine(MaxSEW) +
+" and LMUL == " + (Fractional ? "mf" : "m") + Twine(Lmul) +
+" may not be compatible with all RVV implementations");
 }
 
 unsigned VTypeI =
diff --git a/llvm/test/MC/RISCV/rvv/vsetvl.s b/llvm/test/MC/RISCV/rvv/vsetvl.s
index 207daf392bd50..2741def0eeff2 100644
--- a/llvm/test/MC/RISCV/rvv/vsetvl.s
+++ b/llvm/test/MC/RISCV/rvv/vsetvl.s
@@ -73,21 +73,21 @@ vsetvli a2, a0, e32, m8, ta, ma
 
 vsetvli a2, a0, e32, mf2, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf2, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 
16 and LMUL == mf2 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x75,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d757657 
 
 vsetvli a2, a0, e32, mf4, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf4, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: The use of vtype encodings with SEW 
> 8 and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:17: warning: use of vtype encodings with SEW > 8 
and LMUL == mf4 may not be compatible with all RVV implementations{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x65,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d657657 
 
 vsetvli a2, a0, e32, mf8, ta, ma
 # CHECK-INST: vsetvli a2, a0, e32, mf8, ta, ma
-# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: The use of vtype encodings with 
LMUL < SEWMIN/ELEN == mf4 is reserved{{$}}
+# CHECK-ZVE32X: :[[#@LINE-2]]:22: warning: use of vtype encodings with LMUL < 
SEWMIN/ELEN == mf4 is reserved{{$}}
 # CHECK-ENCODING: [0x57,0x76,0x55,0x0d]
 # CHECK-ERROR: instruction requires the following: 'V' (Vector Extension for 
Application Processors), 'Zve32x' (Vector Extensions for Embedded 
Processors){{$}}
 # CHECK-UNKNOWN: 0d557657 

>From 6b44742cfcc24a07408bbe20070f57ebaa4e9066 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 7 Jun 2024 11:27:13 +0800
Subject: [PATCH 2/2] Remove Fractional

Created using spr 1.3.6-beta.1
---
 llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp 
b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 49f21e46f34ea..ca11d155ec7c6 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -2229,7 +2229,7 @@ ParseStatus RISCVAsmParser::parseVTypeI(OperandVector 
&Operands) {
   if (MaxSEW >= 8 && Sew > MaxSEW)

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-06-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-06-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [NFC][RISCV] Simplify the dynamic linker construction logic (PR #97383)

2024-07-01 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/97383

The format of dynamic linker is `ld-linux-{arch}-{abi}.so.1`, so
we can just get the arch name from arch type.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

Ping.
I'd like to push this forward because we don't take branch probabilities into 
consideration now.
Example: https://godbolt.org/z/doGhYadKM
We should use branches instead of selects in this case and this patch (the 
enabling of SelectOpt) will optimize this.
`clang -O3 -march=rv64gc_zba_zbb_zbc_zbs_zicond -Xclang -target-feature -Xclang 
+enable-select-opt -Xclang -target-feature -Xclang 
+predictable-select-expensive`
```
New Select group with
%. = select i1 %cmp, i32 5, i32 13, !prof !9
Analyzing select group containing   %. = select i1 %cmp, i32 5, i32 13, !prof !9
Converted to branch because of highly predictable branch.
```
```asm
func0:  # @func0
li  a2, 5
mul a1, a0, a0
bge a2, a0, .LBB0_2
addwa0, a1, a2
ret
.LBB0_2:# %select.false
li  a2, 13
addwa0, a1, a2
ret
```

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-04 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

https://github.com/llvm/llvm-project/pull/97708 is splitted out for adding 
`FeaturePredictableSelectIsExpensive`.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-05 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff87..7a26e1956424c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20..8b52e3fe7b2f1 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-07-05 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff878..7a26e1956424cb 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
-if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-  addPass(createSelectOptimizePass());
-
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
   }
 
   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+addPass(createSelectOptimizePass());
 }
 
 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll 
b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20e..8b52e3fe7b2f15 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Scalar Evolution Analysis
 ; CHECK-NEXT:   Loop Data Prefetch
-; CHECK-NEXT:   Post-Dominator Tree Construction
-; CHECK-NEXT:   Branch Probability Analysis
-; CHECK-NEXT:   Block Frequency Analysis
-; CHECK-NEXT:   Lazy Branch Probability Analysis
-; CHECK-NEXT:   Lazy Block Frequency Analysis
-; CHECK-NEXT:   Optimization Remark Emitter
-; CHECK-NEXT:   Optimize selects
-; CHECK-NEXT:   Dominator Tree Construction
-; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   RISC-V gather/scatter lowering
 ; CHECK-NEXT:   Interleaved Access Pass
 ; CHECK-NEXT:   RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT:   Expand reduction intrinsics
 ; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   TLS Variable Hoist
+; CHECK-NEXT:   Post-Dominator Tree Construction
+; CHECK-NEXT:   Branch Probability Analysis
+; CHECK-NEXT:   Block Frequency Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Optimization Remark Emitter
+; CHECK-NEXT:   Optimize selects
+; CHECK-NEXT:   Dominator Tree Construction
+; CHECK-NEXT:   Natural Loop Information
 ; CHECK-NEXT:   CodeGen Prepare
 ; CHECK-NEXT:   Dominator Tree Construction
 ; CHECK-NEXT:   Exception handling preparation

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [CodeGen] Add dump() to MachineTraceMetrics.h (PR #97799)

2024-07-05 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/97799

To enhance debugging.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [EarlyIfCvt] Take branch probablities into consideration (PR #97808)

2024-07-08 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/97808
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [EarlyIfCvt] Take branch probablities into consideration (PR #97808)

2024-07-08 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/97808

>From 32227cf222e28b6dc2a514aabfaff1a579f51668 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 5 Jul 2024 18:19:58 +0800
Subject: [PATCH] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20initia?=
 =?UTF-8?q?l=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 llvm/lib/CodeGen/EarlyIfConversion.cpp   | 92 +++-
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | 13 +++
 llvm/lib/Target/AArch64/AArch64InstrInfo.h   |  7 ++
 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp   | 14 +++
 llvm/lib/Target/AMDGPU/SIInstrInfo.h |  7 ++
 llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp | 16 ++--
 llvm/lib/Target/X86/X86InstrInfo.cpp | 14 +++
 llvm/lib/Target/X86/X86InstrInfo.h   |  7 ++
 8 files changed, 123 insertions(+), 47 deletions(-)

diff --git a/llvm/lib/CodeGen/EarlyIfConversion.cpp 
b/llvm/lib/CodeGen/EarlyIfConversion.cpp
index 5f3e85077cb56..fc94879c90829 100644
--- a/llvm/lib/CodeGen/EarlyIfConversion.cpp
+++ b/llvm/lib/CodeGen/EarlyIfConversion.cpp
@@ -180,6 +180,9 @@ class SSAIfConv {
   /// speculatively execute it.
   bool canConvertIf(MachineBasicBlock *MBB, bool Predicate = false);
 
+  bool isProfitable(const MachineBranchProbabilityInfo *MBPI,
+TargetSchedModel SchedModel);
+
   /// convertIf - If-convert the last block passed to canConvertIf(), assuming
   /// it is possible. Add any erased blocks to RemovedBlocks.
   void convertIf(SmallVectorImpl &RemovedBlocks,
@@ -561,6 +564,45 @@ bool SSAIfConv::canConvertIf(MachineBasicBlock *MBB, bool 
Predicate) {
   return true;
 }
 
+/// Apply the target heuristic to decide if the transformation is profitable.
+bool SSAIfConv::isProfitable(const MachineBranchProbabilityInfo *MBPI,
+ TargetSchedModel SchedModel) {
+  auto TrueProbability = MBPI->getEdgeProbability(Head, TBB);
+  if (isTriangle()) {
+MachineBasicBlock &IfBlock = (TBB == Tail) ? *FBB : *TBB;
+
+unsigned ExtraPredCost = 0;
+unsigned Cycles = 0;
+for (MachineInstr &I : IfBlock) {
+  unsigned NumCycles = SchedModel.computeInstrLatency(&I, false);
+  if (NumCycles > 1)
+Cycles += NumCycles - 1;
+  ExtraPredCost += TII->getPredicationCost(I);
+}
+
+return TII->isProfitableToIfCvt(IfBlock, Cycles, ExtraPredCost,
+TrueProbability);
+  }
+  unsigned TExtra = 0;
+  unsigned FExtra = 0;
+  unsigned TCycle = 0;
+  unsigned FCycle = 0;
+  for (MachineInstr &I : *TBB) {
+unsigned NumCycles = SchedModel.computeInstrLatency(&I, false);
+if (NumCycles > 1)
+  TCycle += NumCycles - 1;
+TExtra += TII->getPredicationCost(I);
+  }
+  for (MachineInstr &I : *FBB) {
+unsigned NumCycles = SchedModel.computeInstrLatency(&I, false);
+if (NumCycles > 1)
+  FCycle += NumCycles - 1;
+FExtra += TII->getPredicationCost(I);
+  }
+  return TII->isProfitableToIfCvt(*TBB, TCycle, TExtra, *FBB, FCycle, FExtra,
+  TrueProbability);
+}
+
 /// \return true iff the two registers are known to have the same value.
 static bool hasSameValue(const MachineRegisterInfo &MRI,
  const TargetInstrInfo *TII, Register TReg,
@@ -760,9 +802,11 @@ namespace {
 class EarlyIfConverter : public MachineFunctionPass {
   const TargetInstrInfo *TII = nullptr;
   const TargetRegisterInfo *TRI = nullptr;
-  MCSchedModel SchedModel;
+  MCSchedModel MCSchedModel;
+  TargetSchedModel TargetSchedModel;
   MachineRegisterInfo *MRI = nullptr;
   MachineDominatorTree *DomTree = nullptr;
+  MachineBranchProbabilityInfo *MBPI = nullptr;
   MachineLoopInfo *Loops = nullptr;
   MachineTraceMetrics *Traces = nullptr;
   MachineTraceMetrics::Ensemble *MinInstr = nullptr;
@@ -871,6 +915,9 @@ bool EarlyIfConverter::shouldConvertIf() {
 
   // Do not try to if-convert if the condition has a high chance of being
   // predictable.
+  if (!IfConv.isProfitable(MBPI, TargetSchedModel))
+return false;
+
   MachineLoop *CurrentLoop = Loops->getLoopFor(IfConv.Head);
   // If the condition is in a loop, consider it predictable if the condition
   // itself or all its operands are loop-invariant. E.g. this considers a load
@@ -911,7 +958,7 @@ bool EarlyIfConverter::shouldConvertIf() {
   FBBTrace.getCriticalPath());
 
   // Set a somewhat arbitrary limit on the critical path extension we accept.
-  unsigned CritLimit = SchedModel.MispredictPenalty/2;
+  unsigned CritLimit = MCSchedModel.MispredictPenalty / 2;
 
   MachineBasicBlock &MBB = *IfConv.Head;
   MachineOptimizationRemarkEmitter MORE(*MBB.getParent(), nullptr);
@@ -1084,9 +1131,11 @@ bool 
EarlyIfConverter::runOnMachineFunction(MachineFunction &MF) {
 
   TII = STI.getInstrInfo();
   TRI = STI.getRegisterInfo();
-  SchedModel = STI.getSchedModel();
+  MCSched

[llvm-branch-commits] [llvm] [EarlyIfCvt] Take branch probablities into consideration (PR #97808)

2024-07-10 Thread Pengcheng Wang via llvm-branch-commits

wangpc-pp wrote:

> > [EarlyIfCvt] Take branch probablities into consideration
> 
> It looks like this MR is only adding a target hook, so this title doesn't 
> make sense to me

I was planning to add support to RISCV target, but it depends on your early 
if-conversion patch. I will hold this PR until RISC-V has got its early 
if-conversion support.

https://github.com/llvm/llvm-project/pull/97808
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [EarlyIfCvt] Take branch probablities into consideration (PR #97808)

2024-07-10 Thread Pengcheng Wang via llvm-branch-commits



@@ -913,6 +913,10 @@ class TargetInstrInfo : public MCInstrInfo {
 return false;
   }
 
+  /// Return true if the target will always try to convert predictable branches
+  /// to selects.
+  virtual bool shouldConvertPredictableBranches() const { return true; }
+

wangpc-pp wrote:

I think the current behavior is we will convert predictable branches. To not 
touch too much targets, I leave it to be true here.

https://github.com/llvm/llvm-project/pull/97808
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Fix InsnCI register type (#100113) (PR #100306)

2024-07-24 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp approved this pull request.


https://github.com/llvm/llvm-project/pull/100306
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't crash in RISCVMergeBaseOffset if INLINE_ASM uses address register in a non-memory constraint. (#100790) (PR #100843)

2024-07-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/100843
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-04-01 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

Ping. Are there any more comments?

https://github.com/llvm/llvm-project/pull/84455
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-04-02 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84455

>From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 17:31:47 +0800
Subject: [PATCH 1/6] Reduce copies

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp |  89 +-
 llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir |  30 +---
 llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++
 3 files changed, 106 insertions(+), 188 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 7895e87702c711..9fe5666d6a81f4 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
RISCVII::VLMUL LMul, unsigned NF) const 
{
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
 
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-I = NF - 1;
-End = -1;
-Incr = -1;
-  }
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+  forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
 
-  for (; I != End; I += Incr) {
+  unsigned I = 0;
+  while (I != NumRegs) {
 auto GetCopyInfo =
-[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple {
-  unsigned Opc;
-  unsigned SubRegIdx;
-  unsigned VVOpc, VIOpc;
-  switch (LMul) {
-  default:
-llvm_unreachable("Impossible LMUL for vector register copy.");
-  case RISCVII::LMUL_1:
-Opc = RISCV::VMV1R_V;
-SubRegIdx = RISCV::sub_vrm1_0;
-VVOpc = RISCV::PseudoVMV_V_V_M1;
-VIOpc = RISCV::PseudoVMV_V_I_M1;
-break;
-  case RISCVII::LMUL_2:
-Opc = RISCV::VMV2R_V;
-SubRegIdx = RISCV::sub_vrm2_0;
-VVOpc = RISCV::PseudoVMV_V_V_M2;
-VIOpc = RISCV::PseudoVMV_V_I_M2;
-break;
-  case RISCVII::LMUL_4:
-Opc = RISCV::VMV4R_V;
-SubRegIdx = RISCV::sub_vrm4_0;
-VVOpc = RISCV::PseudoVMV_V_V_M4;
-VIOpc = RISCV::PseudoVMV_V_I_M4;
-break;
-  case RISCVII::LMUL_8:
-assert(NF == 1);
-Opc = RISCV::VMV8R_V;
-SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-VVOpc = RISCV::PseudoVMV_V_V_M8;
-VIOpc = RISCV::PseudoVMV_V_I_M8;
-break;
-  }
-  return {SubRegIdx, Opc, VVOpc, VIOpc};
+[&](unsigned SrcReg,
+unsigned DstReg) -> std::tuple {
+  unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+  unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+  if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)
+return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, 
RISCV::PseudoVMV_V_V_M8,
+RISCV::PseudoVMV_V_I_M8};
+  if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs)
+return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, 
RISCV::PseudoVMV_V_V_M4,
+RISCV::PseudoVMV_V_I_M4};
+  if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs)
+return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, 
RISCV::PseudoVMV_V_V_M2,
+RISCV::PseudoVMV_V_I_M2};
+  return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1,
+  RISCV::PseudoVMV_V_I_M1};
 };
 
-auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF);
+auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, 
DstReg);
 
 MachineBasicBlock::const_iterator DefMBBI;
 if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
@@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 }
 
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) {
+SrcReg = Reg;
+break;
+  }
+}
+
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) {
+DstReg = Reg;
+break;
+  }
+}
+
 auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) 
{
   auto MIB = BuildMI(MBB, MBBI, DL, get(Opcode), DstReg);
   bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I;
@@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 };
 
-if (NF == 1) {
-  EmitCopy(SrcReg, DstReg, Opc);
-  return;
-}
-
-EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I),
- TR

[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-04-02 Thread Pengcheng Wang via llvm-branch-commits



@@ -212,19 +185,13 @@ body: |
 ; CHECK-NEXT: $v7 = VMV1R_V $v14
 ; CHECK-NEXT: $v8 = VMV1R_V $v15
 ; CHECK-NEXT: $v9 = VMV1R_V $v16
-; CHECK-NEXT: $v4 = VMV1R_V $v10
-; CHECK-NEXT: $v5 = VMV1R_V $v11
-; CHECK-NEXT: $v6 = VMV1R_V $v12
-; CHECK-NEXT: $v7 = VMV1R_V $v13
-; CHECK-NEXT: $v8 = VMV1R_V $v14
-; CHECK-NEXT: $v9 = VMV1R_V $v15
+; CHECK-NEXT: $v4m2 = VMV2R_V $v10m2
+; CHECK-NEXT: $v6m2 = VMV2R_V $v12m2
+; CHECK-NEXT: $v8m2 = VMV2R_V $v14m2
 ; CHECK-NEXT: $v10 = VMV1R_V $v16
-; CHECK-NEXT: $v22 = VMV1R_V $v16
-; CHECK-NEXT: $v21 = VMV1R_V $v15
-; CHECK-NEXT: $v20 = VMV1R_V $v14
-; CHECK-NEXT: $v19 = VMV1R_V $v13
-; CHECK-NEXT: $v18 = VMV1R_V $v12
-; CHECK-NEXT: $v17 = VMV1R_V $v11
+; CHECK-NEXT: $v22m2 = VMV2R_V $v16m2

wangpc-pp wrote:

Thanks for catching this! Fixed.
I haved added more tests and checked the result, I think it should be robust 
now.

https://github.com/llvm/llvm-project/pull/84455
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-04-07 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84455

>From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 17:31:47 +0800
Subject: [PATCH 1/9] Reduce copies

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp |  89 +-
 llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir |  30 +---
 llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++
 3 files changed, 106 insertions(+), 188 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 7895e87702c711..9fe5666d6a81f4 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
RISCVII::VLMUL LMul, unsigned NF) const 
{
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
 
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-I = NF - 1;
-End = -1;
-Incr = -1;
-  }
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+  forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
 
-  for (; I != End; I += Incr) {
+  unsigned I = 0;
+  while (I != NumRegs) {
 auto GetCopyInfo =
-[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple {
-  unsigned Opc;
-  unsigned SubRegIdx;
-  unsigned VVOpc, VIOpc;
-  switch (LMul) {
-  default:
-llvm_unreachable("Impossible LMUL for vector register copy.");
-  case RISCVII::LMUL_1:
-Opc = RISCV::VMV1R_V;
-SubRegIdx = RISCV::sub_vrm1_0;
-VVOpc = RISCV::PseudoVMV_V_V_M1;
-VIOpc = RISCV::PseudoVMV_V_I_M1;
-break;
-  case RISCVII::LMUL_2:
-Opc = RISCV::VMV2R_V;
-SubRegIdx = RISCV::sub_vrm2_0;
-VVOpc = RISCV::PseudoVMV_V_V_M2;
-VIOpc = RISCV::PseudoVMV_V_I_M2;
-break;
-  case RISCVII::LMUL_4:
-Opc = RISCV::VMV4R_V;
-SubRegIdx = RISCV::sub_vrm4_0;
-VVOpc = RISCV::PseudoVMV_V_V_M4;
-VIOpc = RISCV::PseudoVMV_V_I_M4;
-break;
-  case RISCVII::LMUL_8:
-assert(NF == 1);
-Opc = RISCV::VMV8R_V;
-SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-VVOpc = RISCV::PseudoVMV_V_V_M8;
-VIOpc = RISCV::PseudoVMV_V_I_M8;
-break;
-  }
-  return {SubRegIdx, Opc, VVOpc, VIOpc};
+[&](unsigned SrcReg,
+unsigned DstReg) -> std::tuple {
+  unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+  unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+  if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)
+return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, 
RISCV::PseudoVMV_V_V_M8,
+RISCV::PseudoVMV_V_I_M8};
+  if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs)
+return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, 
RISCV::PseudoVMV_V_V_M4,
+RISCV::PseudoVMV_V_I_M4};
+  if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs)
+return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, 
RISCV::PseudoVMV_V_V_M2,
+RISCV::PseudoVMV_V_I_M2};
+  return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1,
+  RISCV::PseudoVMV_V_I_M1};
 };
 
-auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF);
+auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, 
DstReg);
 
 MachineBasicBlock::const_iterator DefMBBI;
 if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
@@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 }
 
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) {
+SrcReg = Reg;
+break;
+  }
+}
+
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) {
+DstReg = Reg;
+break;
+  }
+}
+
 auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) 
{
   auto MIB = BuildMI(MBB, MBBI, DL, get(Opcode), DstReg);
   bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I;
@@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 };
 
-if (NF == 1) {
-  EmitCopy(SrcReg, DstReg, Opc);
-  return;
-}
-
-EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I),
- TR

[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-04-07 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84455

>From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 17:31:47 +0800
Subject: [PATCH 1/9] Reduce copies

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp |  89 +-
 llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir |  30 +---
 llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++
 3 files changed, 106 insertions(+), 188 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 7895e87702c711..9fe5666d6a81f4 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
RISCVII::VLMUL LMul, unsigned NF) const 
{
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
 
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-I = NF - 1;
-End = -1;
-Incr = -1;
-  }
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+  forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
 
-  for (; I != End; I += Incr) {
+  unsigned I = 0;
+  while (I != NumRegs) {
 auto GetCopyInfo =
-[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple {
-  unsigned Opc;
-  unsigned SubRegIdx;
-  unsigned VVOpc, VIOpc;
-  switch (LMul) {
-  default:
-llvm_unreachable("Impossible LMUL for vector register copy.");
-  case RISCVII::LMUL_1:
-Opc = RISCV::VMV1R_V;
-SubRegIdx = RISCV::sub_vrm1_0;
-VVOpc = RISCV::PseudoVMV_V_V_M1;
-VIOpc = RISCV::PseudoVMV_V_I_M1;
-break;
-  case RISCVII::LMUL_2:
-Opc = RISCV::VMV2R_V;
-SubRegIdx = RISCV::sub_vrm2_0;
-VVOpc = RISCV::PseudoVMV_V_V_M2;
-VIOpc = RISCV::PseudoVMV_V_I_M2;
-break;
-  case RISCVII::LMUL_4:
-Opc = RISCV::VMV4R_V;
-SubRegIdx = RISCV::sub_vrm4_0;
-VVOpc = RISCV::PseudoVMV_V_V_M4;
-VIOpc = RISCV::PseudoVMV_V_I_M4;
-break;
-  case RISCVII::LMUL_8:
-assert(NF == 1);
-Opc = RISCV::VMV8R_V;
-SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-VVOpc = RISCV::PseudoVMV_V_V_M8;
-VIOpc = RISCV::PseudoVMV_V_I_M8;
-break;
-  }
-  return {SubRegIdx, Opc, VVOpc, VIOpc};
+[&](unsigned SrcReg,
+unsigned DstReg) -> std::tuple {
+  unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+  unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+  if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)
+return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, 
RISCV::PseudoVMV_V_V_M8,
+RISCV::PseudoVMV_V_I_M8};
+  if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs)
+return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, 
RISCV::PseudoVMV_V_V_M4,
+RISCV::PseudoVMV_V_I_M4};
+  if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs)
+return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, 
RISCV::PseudoVMV_V_V_M2,
+RISCV::PseudoVMV_V_I_M2};
+  return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1,
+  RISCV::PseudoVMV_V_I_M1};
 };
 
-auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF);
+auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, 
DstReg);
 
 MachineBasicBlock::const_iterator DefMBBI;
 if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
@@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 }
 
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) {
+SrcReg = Reg;
+break;
+  }
+}
+
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) {
+DstReg = Reg;
+break;
+  }
+}
+
 auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) 
{
   auto MIB = BuildMI(MBB, MBBI, DL, get(Opcode), DstReg);
   bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I;
@@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 };
 
-if (NF == 1) {
-  EmitCopy(SrcReg, DstReg, Opc);
-  return;
-}
-
-EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I),
- TR

[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-04-07 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84894

>From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 18:41:50 +0800
Subject: [PATCH 1/2] Fix wrong arguments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 3e52583ec8ad82..1b3e6cf10189c5 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
 RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass,
 RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) {
 if (RegClass.contains(DstReg, SrcReg)) {
-  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc,
-getLMul(RegClass.TSFlags),
-/*NF=*/
-getNF(RegClass.TSFlags));
+  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass);
   return;
 }
   }

>From 9f649a2ceabb7d6a8154c68b4b58b0278b606512 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Mon, 25 Mar 2024 16:50:58 +0800
Subject: [PATCH 2/2] clear includes

Created using spr 1.3.6-beta.1
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 27a58460b1ba9c..d28e4e39eadcbc 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -29,7 +29,6 @@
 #include "llvm/CodeGen/MachineTraceMetrics.h"
 #include "llvm/CodeGen/RegisterScavenging.h"
 #include "llvm/CodeGen/StackMaps.h"
-#include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/MC/MCInstBuilder.h"
 #include "llvm/MC/TargetRegistry.h"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-04-07 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84894

>From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 18:41:50 +0800
Subject: [PATCH 1/2] Fix wrong arguments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 3e52583ec8ad82..1b3e6cf10189c5 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
 RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass,
 RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) {
 if (RegClass.contains(DstReg, SrcReg)) {
-  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc,
-getLMul(RegClass.TSFlags),
-/*NF=*/
-getNF(RegClass.TSFlags));
+  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass);
   return;
 }
   }

>From 9f649a2ceabb7d6a8154c68b4b58b0278b606512 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Mon, 25 Mar 2024 16:50:58 +0800
Subject: [PATCH 2/2] clear includes

Created using spr 1.3.6-beta.1
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 27a58460b1ba9c..d28e4e39eadcbc 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -29,7 +29,6 @@
 #include "llvm/CodeGen/MachineTraceMetrics.h"
 #include "llvm/CodeGen/RegisterScavenging.h"
 #include "llvm/CodeGen/StackMaps.h"
-#include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/MC/MCInstBuilder.h"
 #include "llvm/MC/TargetRegistry.h"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-04-07 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84455

>From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 17:31:47 +0800
Subject: [PATCH 01/10] Reduce copies

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp |  89 +-
 llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir |  30 +---
 llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++
 3 files changed, 106 insertions(+), 188 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 7895e87702c711..9fe5666d6a81f4 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
RISCVII::VLMUL LMul, unsigned NF) const 
{
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
 
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-I = NF - 1;
-End = -1;
-Incr = -1;
-  }
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+  forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
 
-  for (; I != End; I += Incr) {
+  unsigned I = 0;
+  while (I != NumRegs) {
 auto GetCopyInfo =
-[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple {
-  unsigned Opc;
-  unsigned SubRegIdx;
-  unsigned VVOpc, VIOpc;
-  switch (LMul) {
-  default:
-llvm_unreachable("Impossible LMUL for vector register copy.");
-  case RISCVII::LMUL_1:
-Opc = RISCV::VMV1R_V;
-SubRegIdx = RISCV::sub_vrm1_0;
-VVOpc = RISCV::PseudoVMV_V_V_M1;
-VIOpc = RISCV::PseudoVMV_V_I_M1;
-break;
-  case RISCVII::LMUL_2:
-Opc = RISCV::VMV2R_V;
-SubRegIdx = RISCV::sub_vrm2_0;
-VVOpc = RISCV::PseudoVMV_V_V_M2;
-VIOpc = RISCV::PseudoVMV_V_I_M2;
-break;
-  case RISCVII::LMUL_4:
-Opc = RISCV::VMV4R_V;
-SubRegIdx = RISCV::sub_vrm4_0;
-VVOpc = RISCV::PseudoVMV_V_V_M4;
-VIOpc = RISCV::PseudoVMV_V_I_M4;
-break;
-  case RISCVII::LMUL_8:
-assert(NF == 1);
-Opc = RISCV::VMV8R_V;
-SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-VVOpc = RISCV::PseudoVMV_V_V_M8;
-VIOpc = RISCV::PseudoVMV_V_I_M8;
-break;
-  }
-  return {SubRegIdx, Opc, VVOpc, VIOpc};
+[&](unsigned SrcReg,
+unsigned DstReg) -> std::tuple {
+  unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+  unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+  if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)
+return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, 
RISCV::PseudoVMV_V_V_M8,
+RISCV::PseudoVMV_V_I_M8};
+  if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs)
+return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, 
RISCV::PseudoVMV_V_V_M4,
+RISCV::PseudoVMV_V_I_M4};
+  if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs)
+return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, 
RISCV::PseudoVMV_V_V_M2,
+RISCV::PseudoVMV_V_I_M2};
+  return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1,
+  RISCV::PseudoVMV_V_I_M1};
 };
 
-auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF);
+auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, 
DstReg);
 
 MachineBasicBlock::const_iterator DefMBBI;
 if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
@@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 }
 
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) {
+SrcReg = Reg;
+break;
+  }
+}
+
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) {
+DstReg = Reg;
+break;
+  }
+}
+
 auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) 
{
   auto MIB = BuildMI(MBB, MBBI, DL, get(Opcode), DstReg);
   bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I;
@@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock 
&MBB,
   }
 };
 
-if (NF == 1) {
-  EmitCopy(SrcReg, DstReg, Opc);
-  return;
-}
-
-EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I),
-

[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-04-07 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84894

>From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 18:41:50 +0800
Subject: [PATCH 1/2] Fix wrong arguments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 3e52583ec8ad82..1b3e6cf10189c5 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
 RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass,
 RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) {
 if (RegClass.contains(DstReg, SrcReg)) {
-  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc,
-getLMul(RegClass.TSFlags),
-/*NF=*/
-getNF(RegClass.TSFlags));
+  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass);
   return;
 }
   }

>From 9f649a2ceabb7d6a8154c68b4b58b0278b606512 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Mon, 25 Mar 2024 16:50:58 +0800
Subject: [PATCH 2/2] clear includes

Created using spr 1.3.6-beta.1
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 27a58460b1ba9c..d28e4e39eadcbc 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -29,7 +29,6 @@
 #include "llvm/CodeGen/MachineTraceMetrics.h"
 #include "llvm/CodeGen/RegisterScavenging.h"
 #include "llvm/CodeGen/StackMaps.h"
-#include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/MC/MCInstBuilder.h"
 #include "llvm/MC/TargetRegistry.h"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-04-07 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84894

>From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 18:41:50 +0800
Subject: [PATCH 1/2] Fix wrong arguments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 3e52583ec8ad82..1b3e6cf10189c5 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
 RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass,
 RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) {
 if (RegClass.contains(DstReg, SrcReg)) {
-  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc,
-getLMul(RegClass.TSFlags),
-/*NF=*/
-getNF(RegClass.TSFlags));
+  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass);
   return;
 }
   }

>From 9f649a2ceabb7d6a8154c68b4b58b0278b606512 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Mon, 25 Mar 2024 16:50:58 +0800
Subject: [PATCH 2/2] clear includes

Created using spr 1.3.6-beta.1
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 27a58460b1ba9c..d28e4e39eadcbc 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -29,7 +29,6 @@
 #include "llvm/CodeGen/MachineTraceMetrics.h"
 #include "llvm/CodeGen/RegisterScavenging.h"
 #include "llvm/CodeGen/StackMaps.h"
-#include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/MC/MCInstBuilder.h"
 #include "llvm/MC/TargetRegistry.h"

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [compiler-rt] release/18.x: [RISCV] Support rv{32, 64}e in the compiler builtins (#88252) (PR #88525)

2024-04-16 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> Hi @wangpc-pp (or anyone else). If you would like to add a note about this 
> fix in the release notes (completely optional). Please reply to this comment 
> with a one or two sentence description of the fix.
Yeah, the description can be:
```
Save/restore routines for RV32E/RV64E are added to compiler-rt.
```

https://github.com/llvm/llvm-project/pull/88525
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Don't use V0 directly in patterns (PR #88496)

2024-04-17 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/88496
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Don't use V0 directly in patterns (PR #88496)

2024-04-17 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/88496
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions (PR #90049)

2024-04-25 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/90049

Masking them as `hasSideEffects=1` stops some optimizations.

For saturating instructions, they may write `vxsat`. This is like
floating-point instructions that may write `fflags`, but we don't
model floating-point instructions as `hasSideEffects=1`.

For fault-only-first instructions, I think we have made its side
effect `vl` an output operand and added explict def of `VL`.

These changes make optimizations like `performCombineVMergeAndVOps`
and MachineCSE possible for these instructions.

As a consequence, `copyprop.mir` can't test what we want to test in
https://reviews.llvm.org/D155140, so we replace `vssra.vi` with a
VCIX instruction (it has side effects).



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Make fixed-point instructions commutable (PR #90053)

2024-04-25 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/90053

This PR includes:
* vsadd.vv/vsaddu.vv
* vaadd.vv/vaaddu.vv
* vsmul.vv



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Make fixed-point instructions commutable (PR #90053)

2024-04-25 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp closed 
https://github.com/llvm/llvm-project/pull/90053
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Make fixed-point instructions commutable (PR #90053)

2024-04-25 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

Sorry for bothering, I just ran spr on a non-spr branch.

https://github.com/llvm/llvm-project/pull/90053
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions (PR #90049)

2024-04-25 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> > For saturating instructions, they may write vxsat. This is like
> > floating-point instructions that may write fflags, but we don't
> > model floating-point instructions as hasSideEffects=1.
> 
> That's because floating point instructions use mayRaiseFPExceptions=1. And 
> STRICT_* nodes set don't set the NoFPExcept bit in MIFlags. Though we don't 
> have a story for how to make reading FFLAGS work with riscv.* intrinsics. 
> That's an issue on all targets as there is no "constrained" or "strict" 
> support for target specific intrinsics.

Thanks! I forgot about `mayRaiseFPExceptions.

I don't know if I understand correctly, if we have defined explicit `Def` list, 
does it mean that we have modelled it and there is no unmodelled side effect?

https://github.com/llvm/llvm-project/pull/90049
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions (PR #90049)

2024-04-25 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

According to `Target.td`:
```c
// Does the instruction have side effects that are not captured by any
// operands of the instruction or other flags?
bit hasSideEffects = ?;
```
It seems we don't need to set `hasSideEffects` for vleNff since we have 
modelled `vl` as an output operand.
As for saturating instructions, I used to think that explicit `Def`/`Use` list 
is kind of side effects `captured by any operands of the instruction` (IIUC), 
so we don't need to set `hasSideEffects` either. And I have just investigated 
AArch64's implementation, they don't set this flag and don't add `Def` list 
(for example, [UQADD - Unsigned saturating 
Add](https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/UQADD--Unsigned-saturating-Add-)):
https://github.com/llvm/llvm-project/blob/2de0bedfebb77a6c8a5b0d00902f796fa4022fd6/llvm/lib/Target/AArch64/AArch64InstrInfo.td#L5348
https://github.com/llvm/llvm-project/blob/2de0bedfebb77a6c8a5b0d00902f796fa4022fd6/llvm/lib/Target/AArch64/AArch64InstrFormats.td#L7317-L7336



https://github.com/llvm/llvm-project/pull/90049
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [llvm] [RISCV] Add subtarget features for profiles (PR #84877)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/84877
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [llvm] [RISCV] Add subtarget features for profiles (PR #84877)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84877

>From ec68548a470d6d9032a900a725e95b92691657b2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 14:28:09 +0800
Subject: [PATCH 1/3] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 clang/test/Driver/riscv-cpus.c| 319 ++
 clang/test/Misc/target-invalid-cpu-note.c |   8 +-
 llvm/lib/Target/RISCV/RISCVProcessors.td  | 224 ++-
 3 files changed, 539 insertions(+), 12 deletions(-)

diff --git a/clang/test/Driver/riscv-cpus.c b/clang/test/Driver/riscv-cpus.c
index ff2bd6f7c8ba34..a285f0f9c41f54 100644
--- a/clang/test/Driver/riscv-cpus.c
+++ b/clang/test/Driver/riscv-cpus.c
@@ -302,3 +302,322 @@
 
 // RUN: not %clang --target=riscv32 -### -c %s 2>&1 -mcpu=generic-rv32 
-march=rv64i | FileCheck -check-prefix=MISMATCH-ARCH %s
 // MISMATCH-ARCH: cpu 'generic-rv32' does not support rv64
+
+// Check profile CPUs
+
+// RUN: %clang -target riscv32 -### -c %s 2>&1 -mcpu=generic-rvi20u32 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U32 %s
+// MCPU-GENERIC-RVI20U32: "-target-cpu" "generic-rvi20u32"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-abi" "ilp32"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rvi20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U64 %s
+// MCPU-GENERIC-RVI20U64: "-target-cpu" "generic-rvi20u64"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U64-SAME: "-target-abi" "lp64"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20U64 %s
+// MCPU-GENERIC-RVA20U64: "-target-cpu" "generic-rva20u64"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20U64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20s64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20S64 %s
+// MCPU-GENERIC-RVA20S64: "-target-cpu" "generic-rva20s64"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zifencei"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ssccptr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvala"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvecd"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svade"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svbare"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva22u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA22U64 %s
+// MCPU-GENERIC-RVA22U64: "-target-cpu" "generic-rva22u64"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zic64b"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbom"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbop"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicb

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/90187

So we can only mantain one place.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [llvm] [RISCV] Add subtarget features for profiles (PR #84877)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84877

>From ec68548a470d6d9032a900a725e95b92691657b2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 14:28:09 +0800
Subject: [PATCH 1/4] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 clang/test/Driver/riscv-cpus.c| 319 ++
 clang/test/Misc/target-invalid-cpu-note.c |   8 +-
 llvm/lib/Target/RISCV/RISCVProcessors.td  | 224 ++-
 3 files changed, 539 insertions(+), 12 deletions(-)

diff --git a/clang/test/Driver/riscv-cpus.c b/clang/test/Driver/riscv-cpus.c
index ff2bd6f7c8ba34..a285f0f9c41f54 100644
--- a/clang/test/Driver/riscv-cpus.c
+++ b/clang/test/Driver/riscv-cpus.c
@@ -302,3 +302,322 @@
 
 // RUN: not %clang --target=riscv32 -### -c %s 2>&1 -mcpu=generic-rv32 
-march=rv64i | FileCheck -check-prefix=MISMATCH-ARCH %s
 // MISMATCH-ARCH: cpu 'generic-rv32' does not support rv64
+
+// Check profile CPUs
+
+// RUN: %clang -target riscv32 -### -c %s 2>&1 -mcpu=generic-rvi20u32 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U32 %s
+// MCPU-GENERIC-RVI20U32: "-target-cpu" "generic-rvi20u32"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-abi" "ilp32"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rvi20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U64 %s
+// MCPU-GENERIC-RVI20U64: "-target-cpu" "generic-rvi20u64"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U64-SAME: "-target-abi" "lp64"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20U64 %s
+// MCPU-GENERIC-RVA20U64: "-target-cpu" "generic-rva20u64"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20U64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20s64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20S64 %s
+// MCPU-GENERIC-RVA20S64: "-target-cpu" "generic-rva20s64"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zifencei"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ssccptr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvala"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvecd"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svade"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svbare"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva22u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA22U64 %s
+// MCPU-GENERIC-RVA22U64: "-target-cpu" "generic-rva22u64"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zic64b"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbom"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbop"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicb

[llvm-branch-commits] [clang] [llvm] [RISCV] Add subtarget features for profiles (PR #84877)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84877

>From ec68548a470d6d9032a900a725e95b92691657b2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 14:28:09 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 clang/test/Driver/riscv-cpus.c| 319 ++
 clang/test/Misc/target-invalid-cpu-note.c |   8 +-
 llvm/lib/Target/RISCV/RISCVProcessors.td  | 224 ++-
 3 files changed, 539 insertions(+), 12 deletions(-)

diff --git a/clang/test/Driver/riscv-cpus.c b/clang/test/Driver/riscv-cpus.c
index ff2bd6f7c8ba34..a285f0f9c41f54 100644
--- a/clang/test/Driver/riscv-cpus.c
+++ b/clang/test/Driver/riscv-cpus.c
@@ -302,3 +302,322 @@
 
 // RUN: not %clang --target=riscv32 -### -c %s 2>&1 -mcpu=generic-rv32 
-march=rv64i | FileCheck -check-prefix=MISMATCH-ARCH %s
 // MISMATCH-ARCH: cpu 'generic-rv32' does not support rv64
+
+// Check profile CPUs
+
+// RUN: %clang -target riscv32 -### -c %s 2>&1 -mcpu=generic-rvi20u32 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U32 %s
+// MCPU-GENERIC-RVI20U32: "-target-cpu" "generic-rvi20u32"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-abi" "ilp32"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rvi20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U64 %s
+// MCPU-GENERIC-RVI20U64: "-target-cpu" "generic-rvi20u64"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U64-SAME: "-target-abi" "lp64"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20U64 %s
+// MCPU-GENERIC-RVA20U64: "-target-cpu" "generic-rva20u64"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20U64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20s64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20S64 %s
+// MCPU-GENERIC-RVA20S64: "-target-cpu" "generic-rva20s64"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zifencei"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ssccptr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvala"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvecd"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svade"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svbare"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva22u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA22U64 %s
+// MCPU-GENERIC-RVA22U64: "-target-cpu" "generic-rva22u64"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zic64b"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbom"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbop"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicb

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90187


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90187


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits



@@ -51,6 +51,14 @@ def Feature64Bit
 def FeatureDummy
 : SubtargetFeature<"dummy", "Dummy", "true", "Dummy">;
 
+class RISCVProfile features>
+: SubtargetFeature;
+
+def RVI20U32 : RISCVProfile<"rvi20u32", [Feature32Bit, FeatureStdExtI]>;
+def RVI20U64 : RISCVProfile<"rvi20u64", [Feature64Bit, FeatureStdExtI]>;

wangpc-pp wrote:

We don't handle implications in TableGen, it is in `RISCVISAInfo` where we 
parse the march string. But yeah, I added `F` which implies `Zicsr`.

https://github.com/llvm/llvm-project/pull/90187
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [llvm] [RISCV] Add subtarget features for profiles (PR #84877)

2024-04-26 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84877

>From ec68548a470d6d9032a900a725e95b92691657b2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 14:28:09 +0800
Subject: [PATCH 1/6] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 clang/test/Driver/riscv-cpus.c| 319 ++
 clang/test/Misc/target-invalid-cpu-note.c |   8 +-
 llvm/lib/Target/RISCV/RISCVProcessors.td  | 224 ++-
 3 files changed, 539 insertions(+), 12 deletions(-)

diff --git a/clang/test/Driver/riscv-cpus.c b/clang/test/Driver/riscv-cpus.c
index ff2bd6f7c8ba34..a285f0f9c41f54 100644
--- a/clang/test/Driver/riscv-cpus.c
+++ b/clang/test/Driver/riscv-cpus.c
@@ -302,3 +302,322 @@
 
 // RUN: not %clang --target=riscv32 -### -c %s 2>&1 -mcpu=generic-rv32 
-march=rv64i | FileCheck -check-prefix=MISMATCH-ARCH %s
 // MISMATCH-ARCH: cpu 'generic-rv32' does not support rv64
+
+// Check profile CPUs
+
+// RUN: %clang -target riscv32 -### -c %s 2>&1 -mcpu=generic-rvi20u32 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U32 %s
+// MCPU-GENERIC-RVI20U32: "-target-cpu" "generic-rvi20u32"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-abi" "ilp32"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rvi20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U64 %s
+// MCPU-GENERIC-RVI20U64: "-target-cpu" "generic-rvi20u64"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U64-SAME: "-target-abi" "lp64"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20U64 %s
+// MCPU-GENERIC-RVA20U64: "-target-cpu" "generic-rva20u64"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20U64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20s64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20S64 %s
+// MCPU-GENERIC-RVA20S64: "-target-cpu" "generic-rva20s64"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zifencei"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ssccptr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvala"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvecd"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svade"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svbare"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva22u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA22U64 %s
+// MCPU-GENERIC-RVA22U64: "-target-cpu" "generic-rva22u64"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zic64b"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbom"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbop"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicb

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90187


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90187


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [llvm] [RISCV] Add subtarget features for profiles (PR #84877)

2024-04-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84877

>From ec68548a470d6d9032a900a725e95b92691657b2 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 14:28:09 +0800
Subject: [PATCH 1/7] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 clang/test/Driver/riscv-cpus.c| 319 ++
 clang/test/Misc/target-invalid-cpu-note.c |   8 +-
 llvm/lib/Target/RISCV/RISCVProcessors.td  | 224 ++-
 3 files changed, 539 insertions(+), 12 deletions(-)

diff --git a/clang/test/Driver/riscv-cpus.c b/clang/test/Driver/riscv-cpus.c
index ff2bd6f7c8ba34..a285f0f9c41f54 100644
--- a/clang/test/Driver/riscv-cpus.c
+++ b/clang/test/Driver/riscv-cpus.c
@@ -302,3 +302,322 @@
 
 // RUN: not %clang --target=riscv32 -### -c %s 2>&1 -mcpu=generic-rv32 
-march=rv64i | FileCheck -check-prefix=MISMATCH-ARCH %s
 // MISMATCH-ARCH: cpu 'generic-rv32' does not support rv64
+
+// Check profile CPUs
+
+// RUN: %clang -target riscv32 -### -c %s 2>&1 -mcpu=generic-rvi20u32 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U32 %s
+// MCPU-GENERIC-RVI20U32: "-target-cpu" "generic-rvi20u32"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U32-SAME: "-target-abi" "ilp32"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rvi20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVI20U64 %s
+// MCPU-GENERIC-RVI20U64: "-target-cpu" "generic-rvi20u64"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-a"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-c"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-d"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-f"
+// MCPU-GENERIC-RVI20U64: "-target-feature" "-m"
+// MCPU-GENERIC-RVI20U64-SAME: "-target-abi" "lp64"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20U64 %s
+// MCPU-GENERIC-RVA20U64: "-target-cpu" "generic-rva20u64"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20U64: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20U64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva20s64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA20S64 %s
+// MCPU-GENERIC-RVA20S64: "-target-cpu" "generic-rva20s64"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccamoa"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccif"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicclsm"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ziccrse"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicntr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zicsr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+zifencei"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+za128rs"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+ssccptr"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvala"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+sstvecd"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svade"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-feature" "+svbare"
+// MCPU-GENERIC-RVA20S64-SAME: "-target-abi" "lp64d"
+
+// RUN: %clang -target riscv64 -### -c %s 2>&1 -mcpu=generic-rva22u64 | 
FileCheck -check-prefix=MCPU-GENERIC-RVA22U64 %s
+// MCPU-GENERIC-RVA22U64: "-target-cpu" "generic-rva22u64"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+m"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+a"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+f"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+d"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+c"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zic64b"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbom"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicbop"
+// MCPU-GENERIC-RVA22U64-SAME: "-target-feature" "+zicb

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90187


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Generate profiles from RISCVProfiles.td (PR #90187)

2024-04-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90187


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Make fixed-point instructions commutable (PR #90372)

2024-04-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/90372

This PR includes:
* vsadd.vv/vsaddu.vv
* vaadd.vv/vaaddu.vv
* vsmul.vv



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Make fixed-point instructions commutable (PR #90372)

2024-04-27 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp closed 
https://github.com/llvm/llvm-project/pull/90372
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions (PR #90049)

2024-04-29 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90049


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions (PR #90049)

2024-04-29 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/90049


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions (PR #90049)

2024-04-29 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/90049
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [AArch64] Remove usage of PostRAScheduler (PR #92871)

2024-05-21 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/92871

This doesn't take effect as we have overrided `enablePostRAScheduler`
and we should use the `FeaturePostRAScheduler` feature in processor
definitions.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [AArch64] Remove usage of PostRAScheduler (PR #92871)

2024-05-21 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> Submit your PRs to `main` branch

I used [spr](https://getcord.github.io/spr/) to create this PR, so I think it's 
OK.

https://github.com/llvm/llvm-project/pull/92871
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [AArch64] Remove usage of PostRAScheduler (PR #92871)

2024-05-21 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> > > Submit your PRs to `main` branch
> > 
> > 
> > I used [spr](https://getcord.github.io/spr/) to create this PR, so I think 
> > it's OK.
> 
> No, your target branch is wrong. Either you should want to merge into `main` 
> or into a branch for another PR created by `spr`. However, in this case it 
> appears you are targeting the same branch as your source.

These are two different branches:
wangpc-pp wants to merge 1 commit into
users/wangpc-pp/spr/main.aarch64-remove-usage-of-postrascheduler
from
users/wangpc-pp/spr/aarch64-remove-usage-of-postrascheduler

https://github.com/llvm/llvm-project/pull/92871
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [AArch64] Remove usage of PostRAScheduler (PR #92871)

2024-05-21 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/92871


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [AArch64] Remove usage of PostRAScheduler (PR #92871)

2024-05-21 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/92871


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [AArch64] Remove usage of PostRAScheduler (PR #92871)

2024-05-21 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

These branches are created by `spr`, which is out of my control, and we can 
merge it into `main` branch via `spr land`. These branches are just magics 
behind `spr`, please don't obsess over this. `spr` is a tool that the community 
suggests, I have used it for a long time and I haven't met any issue.

https://github.com/llvm/llvm-project/pull/92871
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-09 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-09 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/2] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-09 Thread Pengcheng Wang via llvm-branch-commits



@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly

wangpc-pp wrote:

https://github.com/llvm/llvm-project/pull/107824

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-09 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/3] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-09 Thread Pengcheng Wang via llvm-branch-commits



@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedVectorMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};

wangpc-pp wrote:

Yes, it seems we will generate i40/i48/... loads (see also the tests in 
https://github.com/llvm/llvm-project/pull/70469).
This may not work for RISC-V.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-09 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/4] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-09 Thread Pengcheng Wang via llvm-branch-commits



@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedVectorMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};

wangpc-pp wrote:

I removed it first. If RISC-V can benefit from it, we can add it back as a 
follow-up.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-10 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/4] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-10 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/4] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-10 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> This is perhaps more of a comment for #107824 than for this one, but I think 
> we'd benefit from some level of test coverage for OptForSize to demonstrate 
> that we stick with the libcall in cases where expanding increases code size.

Thanks! Done!

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-11 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> I'm working on getting some runtime numbers now, sorry for the delay.

No need to say sorry, I really appreciate your help! 😃 

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't outline pcrel_lo when the function has a section prefix (#107943) (PR #108288)

2024-09-11 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp approved this pull request.

LGTM.
This fixes a bug that exists for a long time.

https://github.com/llvm/llvm-project/pull/108288
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on 
> the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the 
> other benchmarks as far as I can see. Seems to check out with the number of 
> memcmps inlined reported for perlbench!

Thanks a lot! The result is within my expectation.
Is it OK to merge? The next step will be tuning for vectors.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> > The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on 
> > the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the 
> > other benchmarks as far as I can see. Seems to check out with the number of 
> > memcmps inlined reported for perlbench!
> 
> Does spacemit-x60 support unaligned scalar memory and was your test with or 
> without that enabled?

It supports unaligned scalar but not unaligned vector. And it seems we don't 
add these features to `-mcpu=spacemit-x60`. So I think @lukel97 ran the SPEC 
without unaligned scalar.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> > > > The run just finished, I'm seeing a 0.75% improvement on 
> > > > 500.perlbench_r on the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions 
> > > > or improvements on the other benchmarks as far as I can see. Seems to 
> > > > check out with the number of memcmps inlined reported for perlbench!
> > 
> > > 
> > 
> > > Does spacemit-x60 support unaligned scalar memory and was your test with 
> > > or without that enabled?
> > 
> > 
> > 
> > It supports unaligned scalar but not unaligned vector. And it seems we 
> > don't add these features to `-mcpu=spacemit-x60`. So I think @lukel97 ran 
> > the SPEC without unaligned scalar.
> 
> Yeah, -mno-strict-align gave a bus error. I ultimately built it without 
> unaligned scalar since I wasn't sure if unaligned scalar was performant or 
> not. 

IIRC, we have separated this into two options(scalar/vector) now. So maybe we 
can specify the scalar one.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't outline pcrel_lo when the function has a section prefix (#107943) (PR #108288)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> Hi, since we are wrapping up LLVM 19.1.0 we are very strict with the fixes we 
> pick at this point. Can you please respond to the following questions to help 
> me understand if this has to be included in the final release or not.
> 
> Is this PR a fix for a regression or a critical issue?

This PR is a fix for https://github.com/llvm/llvm-project/issues/107520, which 
is a correctness issue about MachineOutliner for RISC-V target.

> 
> What is the risk of accepting this into the release branch?

No risk I think. It fixes a correctnesss issue.

> 
> What is the risk of NOT accepting this into the release branch?

If not, for some cases, we will generate wrong outling results for RISC-V 
target.

https://github.com/llvm/llvm-project/pull/108288
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-09 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

Ping.

And some updates on vector support: currently, `ExpandMemCmp` will only 
generate integer types (`i128`, `i256`) and so on. So, if we simply add `128`, 
`256`, `vlen` to `LoadSizes`, the LLVM IR will use i128/i256/... and then they 
are expanded to legal scalar types as we don't mark i128/i256/... legal when 
RVV exists.

There are two ways to fix this:
1. Make `ExpandMemCmp` geenrate vector types/operations.
2. Make i128/i256/... legal.

I think the first one is the right approach but I need some agreements on this.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-09 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-09 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-11 Thread Pengcheng Wang via llvm-branch-commits



@@ -1144,42 +2872,116 @@ entry:
 define i32 @memcmp_size_4(ptr %s1, ptr %s2) nounwind {
 ; CHECK-ALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:li a2, 4
-; CHECK-ALIGNED-RV32-NEXT:call memcmp
-; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a0, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a7, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a1, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a0, a0, 8
+; CHECK-ALIGNED-RV32-NEXT:or a0, a0, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV32-NEXT:slli a1, a1, 8
+; CHECK-ALIGNED-RV32-NEXT:or a1, a1, a7
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a5, a6
+; CHECK-ALIGNED-RV32-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV32-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV32-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV32-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV32-NEXT:ret
 ;
 ; CHECK-ALIGNED-RV64-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV64:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV64-NEXT:sd ra, 8(sp) # 8-byte Folded Spill
-; CHECK-ALIGNED-RV64-NEXT:li a2, 4
-; CHECK-ALIGNED-RV64-NEXT:call memcmp
-; CHECK-ALIGNED-RV64-NEXT:ld ra, 8(sp) # 8-byte Folded Reload
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV64-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV64-NEXT:lb a0, 3(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a7, 2(a1)
+; CHECK-ALIGNED-RV64-NEXT:lb a1, 3(a1)
+; CHECK-ALIGNED-RV64-NEXT:andi a0, a0, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a4, a4, 8
+; CHECK-ALIGNED-RV64-NEXT:or a0, a4, a0
+; CHECK-ALIGNED-RV64-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a2, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV64-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV64-NEXT:andi a1, a1, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a7, a7, 8
+; CHECK-ALIGNED-RV64-NEXT:or a1, a7, a1
+; CHECK-ALIGNED-RV64-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a5, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a6
+; CHECK-ALIGNED-RV64-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV64-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV64-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV64-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV64-NEXT:ret
 ;
 ; CHECK-UNALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-UNALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-UNALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-UNALIGNED-RV32-NEXT:li a2, 4
-; CHECK-UNALIGNED-RV32-NEXT:call memcmp
-; CHECK-UNALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-UNALIGNED-RV32-NEXT:lw a0, 0(a0)

wangpc-pp wrote:

Good question, I don't know the answer, but I think we can be aggressive here 
and enable it for all configurations. If we have some evidences that shows 
inefficiencies, we can go back to here and disable it. WDYT?

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-11 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-11 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-11 Thread Pengcheng Wang via llvm-branch-commits



@@ -112,42 +104,46 @@ entry:
 define i32 @bcmp_size_2(ptr %s1, ptr %s2) nounwind optsize {
 ; CHECK-ALIGNED-RV32-LABEL: bcmp_size_2:
 ; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:li a2, 2
-; CHECK-ALIGNED-RV32-NEXT:call bcmp
-; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)

wangpc-pp wrote:

Great! This is an optimization we can do in ExpandMemcmp, I will take a look.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-10 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-10 Thread Pengcheng Wang via llvm-branch-commits



@@ -1144,42 +2872,116 @@ entry:
 define i32 @memcmp_size_4(ptr %s1, ptr %s2) nounwind {
 ; CHECK-ALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:li a2, 4
-; CHECK-ALIGNED-RV32-NEXT:call memcmp
-; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a0, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a7, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a1, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a0, a0, 8
+; CHECK-ALIGNED-RV32-NEXT:or a0, a0, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV32-NEXT:slli a1, a1, 8
+; CHECK-ALIGNED-RV32-NEXT:or a1, a1, a7
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a5, a6
+; CHECK-ALIGNED-RV32-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV32-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV32-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV32-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV32-NEXT:ret
 ;
 ; CHECK-ALIGNED-RV64-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV64:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV64-NEXT:sd ra, 8(sp) # 8-byte Folded Spill
-; CHECK-ALIGNED-RV64-NEXT:li a2, 4
-; CHECK-ALIGNED-RV64-NEXT:call memcmp
-; CHECK-ALIGNED-RV64-NEXT:ld ra, 8(sp) # 8-byte Folded Reload
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV64-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV64-NEXT:lb a0, 3(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a7, 2(a1)
+; CHECK-ALIGNED-RV64-NEXT:lb a1, 3(a1)
+; CHECK-ALIGNED-RV64-NEXT:andi a0, a0, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a4, a4, 8
+; CHECK-ALIGNED-RV64-NEXT:or a0, a4, a0
+; CHECK-ALIGNED-RV64-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a2, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV64-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV64-NEXT:andi a1, a1, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a7, a7, 8
+; CHECK-ALIGNED-RV64-NEXT:or a1, a7, a1
+; CHECK-ALIGNED-RV64-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a5, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a6
+; CHECK-ALIGNED-RV64-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV64-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV64-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV64-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV64-NEXT:ret
 ;
 ; CHECK-UNALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-UNALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-UNALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-UNALIGNED-RV32-NEXT:li a2, 4
-; CHECK-UNALIGNED-RV32-NEXT:call memcmp
-; CHECK-UNALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-UNALIGNED-RV32-NEXT:lw a0, 0(a0)

wangpc-pp wrote:

Done.
It seems we can benefit from not only `rev8` instructions but also `pack` 
instructions (for forming large integers).
But I don't think we should disable the expansion when Zbb/Zbkb don't exist. If 
these extensions are not supported and we don't expand memcmp, we would still 
execute these instructions in glibc, right?

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-10 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-10 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20in?=
 =?UTF-8?q?itial=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-10-10 Thread Pengcheng Wang via llvm-branch-commits



@@ -112,42 +104,46 @@ entry:
 define i32 @bcmp_size_2(ptr %s1, ptr %s2) nounwind optsize {
 ; CHECK-ALIGNED-RV32-LABEL: bcmp_size_2:
 ; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:li a2, 2
-; CHECK-ALIGNED-RV32-NEXT:call bcmp
-; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)

wangpc-pp wrote:

It costs the same number of instructions?

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

1 2 >

1 - 100 of 196 matches

Mail list logo