================
@@ -80,8 +85,10 @@ class NVPTXSubtarget : public NVPTXGenSubtargetInfo {
   bool allowFP16Math() const;
   bool hasMaskOperator() const { return PTXVersion >= 71; }
   bool hasNoReturn() const { return SmVersion >= 30 && PTXVersion >= 64; }
-  unsigned int getSmVersion() const { return SmVersion; }
+  unsigned int getSmVersion() const { return FullSmVersion / 10; }
+  unsigned int getFullSmVersion() const { return FullSmVersion; }
   std::string getTargetName() const { return TargetName; }
+  bool isSm90a() const { return getFullSmVersion() == 901; }
----------------
Artem-B wrote:

According to [CUDA docs](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=sm_90a#ptx-module-directives-target):

> Target architectures with suffix “a”, such as sm_90a, include
> architecture-accelerated features that are supported on the specified
> architecture only, hence such targets do not follow the onion layer model.
> Therefore, PTX code generated for such targets cannot be run on later
> generation devices. Architecture-accelerated features can only be used with
> targets that support these features.

It's not clear where they are going with this approach. I can make it a more generic `int hasAAFeatures() { return FullSmVersion % 10; }` if that's what you're looking for.

https://github.com/llvm/llvm-project/pull/74895
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
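
For illustration only, here is a minimal standalone sketch of how the generic check could sit next to the accessors from the diff. `SubtargetSketch` and `demo` are hypothetical names, not part of the patch, and the encoding (SM version times 10 plus a suffix digit, so sm_90 is 900 and sm_90a is 901) is an assumption inferred from `getSmVersion()` and `isSm90a()` above. The sketch returns `bool` rather than the `int` proposed in the comment.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant part of NVPTXSubtarget; not the real class.
struct SubtargetSketch {
  // Assumed encoding: SM version * 10 + suffix digit (sm_90 -> 900, sm_90a -> 901).
  unsigned FullSmVersion;

  // Base SM version with the suffix digit dropped.
  unsigned getSmVersion() const { return FullSmVersion / 10; }
  unsigned getFullSmVersion() const { return FullSmVersion; }

  // Generic variant: true for any target whose suffix digit is non-zero.
  bool hasAAFeatures() const { return FullSmVersion % 10 != 0; }

  // Specific check as written in the diff.
  bool isSm90a() const { return getFullSmVersion() == 901; }
};

// Small usage example under the assumed encoding.
inline void demo() {
  SubtargetSketch Sm90{900};
  SubtargetSketch Sm90a{901};

  assert(Sm90.getSmVersion() == 90 && !Sm90.hasAAFeatures());
  assert(Sm90a.getSmVersion() == 90 && Sm90a.hasAAFeatures());
  assert(Sm90a.isSm90a() && !Sm90.isSm90a());
}
```

Under this assumption both targets report the same base SM version, so existing `getSmVersion()` comparisons keep working and only code that cares about architecture-accelerated features needs the extra query.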