llvmbot wrote:

<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

<details>
<summary>Changes</summary>

Following up the discussions in the [RISC-V LLVM Syncup 
Meeting](https://docs.google.com/document/d/1G3ocHm2zE6AYTS2N3_3w2UxFnSEyKkcF57siLWe-NVs/edit?tab=t.0#heading=h.1jq17mylab94)
 on Oct 9, I'm summarizing in this PR the related backgrounds as well as 
concerns folks have raised during the meeting regarding the future direction of 
passing tuning features from the Clang driver. Feel free to use this PR as a 
discussion thread on this topic.

### More flexible RISC-V scheduling models
The original problem that motivated this whole body of work was that for a long 
time, if we wanted to share a single scheduling model among multiple CPUs that 
only differ in a small number of tuning features (e.g. whether it has fast 
vrgather), the only way was to create a base TableGen class that is 
subsequently parameterized and instantiated into a scheduling model for _each_ 
combination of the said tuning features. This TableGen-only solution doesn't 
scale in a sense that the number of scheduling models will explode upon adding 
more tuning features in the future.

I tried to tackle this problem by allowing scheduling model to be configured by 
subtargetfearue, via a new SchedPredicate called `FeatureSchedPredicate` (#<!-- 
-->161888 ). Such that we can create multiple CPU / tune CPU with only a 
_single_ scheduling model:

```
def Foo : RISCVTuneProcessorModel&lt;"foo", FooModel&gt;;
def FooWithFastVRGather : RISCVTuneProcessorModel&lt;"foo-fast-vrgather", 
FooModel, [TuneFastVRGather]&gt;;
```

The problem is, now CPU / tune CPU -- which maps to `-mcpu` and `-mtune` in the 
Clang driver, respectively -- become the one that doesn't scale, as we have to 
create a new `-mcpu` / `-mtune` for each combination of features &amp; 
scheduling model.

### Ways of passing tuning features at this moment
The problem can boil down into finding a proper way to pass tuning features 
from the Clang driver (we're only discussing driver here as you can always pass 
feature to the frontend via `-target-feature`). As a starter, it's important to 
know what we're doing at this moment, which can be categorized into three ways:
  - Retrieve tuning features from a CPU name. We currently encoded two 
unaligned-memory-related subtarget features directly into the `RISCVCPUInfo` 
data structure generated by RISC-V's TargetParser. This data structure is 
queried by driver which generates the corresponding `-target-feature` upon 
seeing a matching `-mcpu`.
  - Explicitly translate some of the driver flags. Flags that fall into 
category include `-ffixed-&lt;register name&gt;` and `-mrelax`.
  - Similar to the previous one, driver flag declarations that are included 
into the `m_riscv_Features_Group` will be automatically translated into the 
corresponding feature names based on their driver flag names. Currently only 
`-msave-restore` / `-mnosave-restore` use this.

It is also worth mentioning that RVI is considering adding _[Software 
Optimization Guidance 
Options](https://riscv.atlassian.net/wiki/spaces/TOXX/pages/585400366/Software+Optimization+Guidance+Options+Fast+Track+Approval+Request)_
 -- a concept similar to a RISC-V extension (whose name starts with "O") but 
only provides uArch/implementation hints to the compiler rather than 
representing any architecture level contracts. Despite being closely related to 
what we're discussing here, these options/extensions will go into the _march_ 
string.

### Candidate solutions
We have roughly two candidate solutions on the table now:

#### New syntax for `-mtune`
Introducing an alternative syntax to `-mtune` in addition to the current one: 
`-mtune=&lt;tune cpu name&gt;:+x,-y,+z...` where `+x,-y,+z` is the feature 
string that will be converted into individual `-target-feature` flags. Code in 
this PR contains a prototype of this approach.

Pros:
  - Scale better when there is a large number of tuning features in the future 
(no matter how many of them it'll always be a single flag)

Cons:
  - Without any special handling, subtracting features with LLVM's current 
feature string design can be tricky (e.g. `-foo,+foo` will still give you `foo` 
but not when you write `+foo,-foo`)
    - Alternative approach: we can forbid feature subtraction in the new 
`-mtune` string.
  - It's more difficult to handle or reason about implied features or just 
interactions between different features. ARM/AArch64 ran into a similar 
problems before with a similar mechanism of using "+feature" in their 
march/mcpu string.

#### Toggle tuning features with the existing `-mfoo` mechanism
Feature `foo` will be controlled by `-mfoo` &amp; `-mno-foo`.

Pros:
  - Adding &amp; subtracting features would be less of a problem compared to 
the previous `-mtune` approach, because the mechanism of `-mfoo` &amp; 
`-mno-foo` is well defined. 
  - Hide the implementation at the driver level, including populating the 
implied features or even changing the actual subtarget feature names without 
breaking the user interface (i.e. driver).

Cons:
  - Basically the opposite of the advantage from the previous `-mtune` 
approach: the number of `-m` flags will increase really fast (we need at least 
two flags per subtarget features) upon adding more tuning features in the 
future.

There is another challenge we have to address in both options: GCC 
compatibility. We have to talk with the GCC folks on this topic as we almost 
certain want a consistent interface and behavior across these two compilers.

I hope I captured everything we discussed in the meeting. Feel free to point 
out things that I missed.

---
Full diff: https://github.com/llvm/llvm-project/pull/162716.diff


4 Files Affected:

- (modified) clang/lib/Driver/ToolChains/Arch/RISCV.cpp (+28) 
- (modified) clang/lib/Driver/ToolChains/Arch/RISCV.h (+3) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+1-4) 
- (modified) clang/test/Driver/riscv-cpus.c (+5) 


``````````diff
diff --git a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp 
b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
index f2e79e71f93d4..f51e3ea6bfe73 100644
--- a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
@@ -367,3 +367,31 @@ std::string riscv::getRISCVTargetCPU(const 
llvm::opt::ArgList &Args,
 
   return Triple.isRISCV64() ? "generic-rv64" : "generic-rv32";
 }
+
+void riscv::addMtuneWithFeatures(const Arg *A, const ArgList &Args,
+                                 ArgStringList &CmdArgs) {
+  // format: -mtune=<tune cpu>[:+a,-b,+c...]
+  StringRef MTune(A->getValue());
+  size_t ColonPos = MTune.find(':');
+  bool HasColon = ColonPos != StringRef::npos;
+
+  StringRef TuneCPU = MTune.take_front(ColonPos);
+  if (TuneCPU == "native")
+    CmdArgs.push_back(Args.MakeArgString(llvm::sys::getHostCPUName()));
+  else
+    CmdArgs.push_back(HasColon ? Args.MakeArgString(TuneCPU) : A->getValue());
+
+  StringRef FeatureStr = HasColon ? MTune.drop_front(ColonPos + 1) : "";
+  // TODO: Emit error
+  assert(!HasColon || !FeatureStr.empty());
+  if (FeatureStr.empty())
+    return;
+  StringRef Feature;
+  do {
+    std::tie(Feature, FeatureStr) = FeatureStr.split(',');
+    // TODO: Emit error
+    assert(Feature.starts_with('-') || Feature.starts_with('+'));
+    CmdArgs.push_back("-target-feature");
+    CmdArgs.push_back(Args.MakeArgString(Feature));
+  } while (!FeatureStr.empty());
+}
diff --git a/clang/lib/Driver/ToolChains/Arch/RISCV.h 
b/clang/lib/Driver/ToolChains/Arch/RISCV.h
index 388786b9c4c1f..3e43430051cca 100644
--- a/clang/lib/Driver/ToolChains/Arch/RISCV.h
+++ b/clang/lib/Driver/ToolChains/Arch/RISCV.h
@@ -28,6 +28,9 @@ std::string getRISCVArch(const llvm::opt::ArgList &Args,
                          const llvm::Triple &Triple);
 std::string getRISCVTargetCPU(const llvm::opt::ArgList &Args,
                               const llvm::Triple &Triple);
+void addMtuneWithFeatures(const llvm::opt::Arg *A,
+                          const llvm::opt::ArgList &Args,
+                          llvm::opt::ArgStringList &CmdArgs);
 } // end namespace riscv
 } // namespace tools
 } // end namespace driver
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index d326a81feb762..e39a8d4996089 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2046,10 +2046,7 @@ void Clang::AddRISCVTargetArgs(const ArgList &Args,
 
   if (const Arg *A = Args.getLastArg(options::OPT_mtune_EQ)) {
     CmdArgs.push_back("-tune-cpu");
-    if (strcmp(A->getValue(), "native") == 0)
-      CmdArgs.push_back(Args.MakeArgString(llvm::sys::getHostCPUName()));
-    else
-      CmdArgs.push_back(A->getValue());
+    riscv::addMtuneWithFeatures(A, Args, CmdArgs);
   }
 
   // Handle -mrvv-vector-bits=<bits>
diff --git a/clang/test/Driver/riscv-cpus.c b/clang/test/Driver/riscv-cpus.c
index 5d5fdd72baedb..806c42bdad374 100644
--- a/clang/test/Driver/riscv-cpus.c
+++ b/clang/test/Driver/riscv-cpus.c
@@ -305,6 +305,11 @@
 // RUN: %clang --target=riscv64 -### -c %s 2>&1 -mtune=native | FileCheck 
-check-prefix=MTUNE-NATIVE %s
 // MTUNE-NATIVE-NOT: "-tune-cpu" "native"
 
+// Checking `-mtune=<tune cpu>:+x,-y,+z...`
+// RUN: %clang --target=riscv64 -### -c %s 2>&1 -mtune=sifive-e20:+xyz,-abc | 
FileCheck -check-prefix=MTUNE-WITH-FEATURES %s
+// MTUNE-WITH-FEATURES: "-tune-cpu" "sifive-e20"
+// MTUNE-WITH-FEATURES: "-target-feature" "+xyz" "-target-feature" "-abc"
+
 // -mcpu with default -march
 // RUN: %clang --target=riscv64 -### -c %s 2>&1 -mcpu=sifive-e20 | FileCheck 
-check-prefix=MCPU-SIFIVE-E20 %s
 // MCPU-SIFIVE-E20: "-nostdsysteminc" "-target-cpu" "sifive-e20"

``````````

</details>


https://github.com/llvm/llvm-project/pull/162716
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to