[PATCH] D31210: [AMDGPU] Add new address space mapping

2017-03-22 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. I also do not exactly like the names "old" and "new". This implies we are going to switch to "new" permanently and are in transition. That is not clear yet, however. https://reviews.llvm.org/D31210 ___ cfe-commits mailing

[PATCH] D31210: [AMDGPU] Add new address space mapping

2017-03-22 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. In https://reviews.llvm.org/D31210#707842, @yaxunl wrote: > In https://reviews.llvm.org/D31210#707832, @rampitec wrote: > > > I also do not exactly like the names "old" and "new". This implies we are going > > to switch to "new" permanently and are in transition. That is

[PATCH] D31210: [AMDGPU] Add new address space mapping

2017-03-21 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. I'm concerned about the default address space being 64-bit. It would move alloca into the generic address space, effectively making private addresses 64-bit. This may have very undesirable performance implications; for example, address arithmetic can become expensive 64 bit

[PATCH] D37386: [AMDGPU] Implement infrastructure to set options in AMDGPUToolChain

2017-09-04 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM https://reviews.llvm.org/D37386 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D37386: [AMDGPU] Implement infrastructure to set options in AMDGPUToolChain

2017-09-01 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: lib/Driver/ToolChains/AMDGPU.h:44 +private: + const std::map OptionsDefault = { + {options::OPT_O, "3"}, Is it really necessary to create the map in the header?

[PATCH] D50984: AMDGPU: Move target code into TargetParser

2018-08-21 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM https://reviews.llvm.org/D50984

[PATCH] D56525: [AMDGPU] Separate feature dot-insts

2019-01-09 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec created this revision. rampitec added reviewers: b-sumner, kzhuravl, msearles. Herald added subscribers: cfe-commits, t-tye, tpr, dstuttard, yaxunl, nhaehnle, wdng, jvesely. clang part Repository: rC Clang https://reviews.llvm.org/D56525 Files:

[PATCH] D56525: [AMDGPU] Separate feature dot-insts

2019-01-09 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL350794: [AMDGPU] Separate feature dot-insts (authored by rampitec). Changed prior to commit: https://reviews.llvm.org/D56525?vs=180969&id=180991#toc Repository: rL LLVM CHANGES SINCE

[PATCH] D58847: AMDGPU: Fix the mapping of sub group sync scope

2019-03-01 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58847/new/ https://reviews.llvm.org/D58847

[PATCH] D59494: AMDGPU: Add support for cross address space synchronization scopes

2019-03-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7973 + if (Ordering != llvm::AtomicOrdering::SequentiallyConsistent) { +if (Scope != SyncScope::OpenCLAllSVMDevices) + Name = Twine(Twine(Name) + Twine("-")).str(); if

[PATCH] D59494: AMDGPU: Add support for cross address space synchronization scopes (clang)

2019-03-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7976 + +Name = Twine(Twine(Name) + Twine("one-as")).str(); + } b-sumner wrote: > kzhuravl wrote: > > rampitec wrote: > > > I think subgroup is in the single address space even if
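
The two inline snippets from D59494 show a sync scope name being extended with a "one-as" suffix when the ordering is not sequentially consistent. A minimal standalone sketch of that naming rule follows. The enums and the base-name table are hypothetical stand-ins for the clang/LLVM types (only the branch structure and the "one-as" suffix come from the quoted diff), so treat it as an illustration rather than the code under review.

```cpp
#include <string>

// Hypothetical stand-ins for the clang/LLVM enums referenced in the diff.
enum class SyncScope {
  OpenCLWorkGroup,
  OpenCLDevice,
  OpenCLAllSVMDevices,
  OpenCLSubGroup
};
enum class AtomicOrdering {
  Monotonic,
  Acquire,
  Release,
  AcquireRelease,
  SequentiallyConsistent
};

// Assumed base names for illustration; the all-SVM-devices (system) scope
// is taken to map to the empty string.
std::string baseScopeName(SyncScope Scope) {
  switch (Scope) {
  case SyncScope::OpenCLWorkGroup:
    return "workgroup";
  case SyncScope::OpenCLDevice:
    return "agent";
  case SyncScope::OpenCLAllSVMDevices:
    return "";
  case SyncScope::OpenCLSubGroup:
    return "wavefront";
  }
  return "";
}

// Mirrors the quoted branch: for any ordering other than seq_cst, append
// "one-as", separated by '-' unless the base name is the empty system scope.
std::string syncScopeName(SyncScope Scope, AtomicOrdering Ordering) {
  std::string Name = baseScopeName(Scope);
  if (Ordering != AtomicOrdering::SequentiallyConsistent) {
    if (Scope != SyncScope::OpenCLAllSVMDevices)
      Name += "-";
    Name += "one-as";
  }
  return Name;
}
```

Under these assumptions, a relaxed work-group atomic gets the name "workgroup-one-as", which matches the `syncscope("workgroup-one-as")` string that appears in the D106891 test snippet further down this listing.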

[PATCH] D59494: AMDGPU: Add support for cross address space synchronization scopes (clang)

2019-03-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59494/new/ https://reviews.llvm.org/D59494

[PATCH] D57349: AMDGPU: Add ds append/consume builtins

2019-01-28 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57349/new/ https://reviews.llvm.org/D57349

[PATCH] D63366: AMDGPU: Add GWS instruction builtins

2019-06-14 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63366/new/ https://reviews.llvm.org/D63366

[PATCH] D63578: AMDGPU: Add DS GWS sema builtins

2019-06-20 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63578/new/ https://reviews.llvm.org/D63578

[PATCH] D63649: AMDGPU: Fix target builtins for gfx10

2019-06-21 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63649/new/ https://reviews.llvm.org/D63649

[PATCH] D61875: [AMDGPU] gfx1010 clang target

2019-05-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC360634: [AMDGPU] gfx1010 clang target (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rC Clang CHANGES SINCE LAST ACTION

[PATCH] D61112: AMDGPU: Enable _Float16

2019-04-25 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61112/new/ https://reviews.llvm.org/D61112

[PATCH] D64828: AMDGPU: Add some missing builtins

2019-07-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64828/new/ https://reviews.llvm.org/D64828

[PATCH] D66198: AMDGPU: Add builtins for is_local/is_private

2019-08-14 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66198/new/ https://reviews.llvm.org/D66198

[PATCH] D66198: AMDGPU: Add builtins for is_local/is_private

2019-08-14 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. Didn't you forget to update test/CodeGenOpenCL/amdgpu-features.cl? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D66198/new/ https://reviews.llvm.org/D66198

[PATCH] D62739: AMDGPU: Always emit amdgpu-flat-work-group-size

2019-08-22 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7885 +// By default, restrict the maximum size to 256. +F->addFnAttr("amdgpu-flat-work-group-size", "128,256"); } arsenm wrote: > yaxunl wrote: > > arsenm wrote: > > > b-sumner

[PATCH] D62739: AMDGPU: Always emit amdgpu-flat-work-group-size

2019-08-27 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62739/new/ https://reviews.llvm.org/D62739

[PATCH] D65454: AMDGPU: Add missing builtin declarations

2019-07-30 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D65454/new/ https://reviews.llvm.org/D65454

[PATCH] D76076: [HIP] Mark kernels with uniform-work-group-size=true

2020-03-12 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76076/new/ https://reviews.llvm.org/D76076

[PATCH] D77329: [AMDGPU] Allow AGPR in inline asm

2020-04-02 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenOpenCL/inline-asm-amdgcn.cl:16 + float reg_b; + float reg_c; + // CHECK: call <32 x float> asm "v_mfma_f32_32x32x1f32 $0, $1, $2, $3",

[PATCH] D77329: [AMDGPU] Allow AGPR in inline asm

2020-04-02 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. Thanks. Could you also update AMDGPUTargetInfo::GCCRegNames[] (in a separate change)? It is used in clobber constraints. BTW, it does not support register tuples even for V and S now.

[PATCH] D79744: clang: Add address space to indirect abi info and use it for kernels

2020-05-11 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. Typo in commit message: "Previously, indirect arguments assumed assumed". CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79744/new/ https://reviews.llvm.org/D79744

[PATCH] D76472: AMDGPU: Emit llvm.fshr for __builtin_amdgcn_alignbit

2020-03-23 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76472/new/ https://reviews.llvm.org/D76472

[PATCH] D88916: [AMDGPU] Add gfx602, gfx705, gfx805 targets

2020-10-07 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D88916/new/ https://reviews.llvm.org/D88916

[PATCH] D89487: [AMDGPU] gfx1032 target

2020-10-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGd1beb95d1241: [AMDGPU] gfx1032 target (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D89487: [AMDGPU] gfx1032 target

2020-10-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/Driver/amdgpu-macros.cl:216 // GFX1031-DAG: #define FP_FAST_FMA 1 +// GFX1032-DAG: #define FP_FAST_FMA 1 tra wrote: > This test could use some refactoring. > Individual macro checks could be collapsed to

[PATCH] D89487: [AMDGPU] gfx1032 target

2020-10-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec marked 3 inline comments as done. rampitec added inline comments. Comment at: llvm/docs/AMDGPUUsage.rst:280 names. + ``gfx1032`` ``amdgcn`` dGPU - xnack

[PATCH] D89487: [AMDGPU] gfx1032 target

2020-10-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec marked 3 inline comments as done. rampitec added inline comments. Comment at: llvm/lib/Support/TargetParser.cpp:66 // Don't bother listing the implicitly true features -constexpr GPUInfo AMDGCNGPUs[43] = { +constexpr GPUInfo AMDGCNGPUs[44] = { // Name

[PATCH] D89582: clang/AMDGPU: Apply workgroup related attributes to all functions

2020-10-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. In D89582#2335704, @arsenm wrote: > In D89582#2335671, @rampitec wrote: > >> In D89582#2335619, @arsenm wrote: >> >>> In D89582#2335574

[PATCH] D89582: clang/AMDGPU: Apply workgroup related attributes to all functions

2020-10-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. In D89582#2335619, @arsenm wrote: > In D89582#2335574, @yaxunl wrote: > >> What if a device function is called by kernels with different work group >> sizes, will caller's work group

[PATCH] D87947: [AMDGPU] Make ds fp atomics overloadable

2020-09-23 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG59691dc8740c: [AMDGPU] Make ds fp atomics overloadable (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo CHANGES SINCE

[PATCH] D87972: [OldPM] Pass manager: run SROA after (simple) loop unrolling

2020-09-21 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. This is obviously LGTM from the AMDGPU BE point of view, we did it ourselves. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D87972/new/ https://reviews.llvm.org/D87972

[PATCH] D85337: [AMDGPU] gfx1031 target

2020-08-06 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/tools/llvm-readobj/ELFDumper.cpp:1844 LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1030), + LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1031), LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_XNACK),

[PATCH] D85337: [AMDGPU] gfx1031 target

2020-08-10 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec marked 2 inline comments as done. rampitec added inline comments. Comment at: llvm/tools/llvm-readobj/ELFDumper.cpp:1844 LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1030), + LLVM_READOBJ_ENUM_ENT(ELF, EF_AMDGPU_MACH_AMDGCN_GFX1031),

[PATCH] D85337: [AMDGPU] gfx1031 target

2020-08-06 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGea7d0e2996ec: [AMDGPU] gfx1031 target (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D81886: [AMDGPU] Add gfx1030 target

2020-06-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec marked an inline comment as done. rampitec added inline comments. Comment at: llvm/docs/AMDGPUUsage.rst:266-267 names. + ``gfx1030`` ``amdgcn`` dGPU - xnack

[PATCH] D81959: [HIP] Enable -amdgpu-internalize-symbols

2020-06-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM, thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81959/new/ https://reviews.llvm.org/D81959

[PATCH] D92115: AMDGPU - Add diagnostic for compiling modules with AMD HSA OS type and GFX 6 arch

2020-11-25 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. You need to add a new test for this new error. Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:134 + if (isAmdHsaOS() && getGeneration() == AMDGPUSubtarget::SOUTHERN_ISLANDS) { +report_fatal_error("GFX6 (SI) ASICs does not support AMD HSA

[PATCH] D92115: AMDGPU - Add diagnostic for compiling modules with AMD HSA OS type and GFX 6 arch

2020-12-07 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp:62-72 +static AMDGPUSubtarget::Generation initializeGen(const Triple , + StringRef GPU) { + if (GPU.contains("generic")) { +return

[PATCH] D90447: [AMDGPU] Add gfx1033 target

2020-10-30 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. Missing changes to these files: clang/include/clang/Basic/Cuda.h clang/lib/Basic/Cuda.cpp clang/lib/Basic/Targets/NVPTX.cpp clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp llvm/lib/Object/ELFObjectFile.cpp llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml

[PATCH] D90447: [AMDGPU] Add gfx1033 target

2020-10-30 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90447/new/ https://reviews.llvm.org/D90447

[PATCH] D90809: [amdgpu] Add `llvm.amdgcn.endpgm` support.

2020-11-05 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/include/llvm/IR/IntrinsicsAMDGPU.td:1581 +def int_amdgcn_endpgm : GCCBuiltin<"__builtin_amdgcn_endpgm">, + Intrinsic<[], [], [IntrNoReturn, IntrNoMem, IntrHasSideEffects] +>; Maybe also IntrCold? Repository:

[PATCH] D90809: [amdgpu] Add `llvm.amdgcn.endpgm` support.

2020-11-05 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. In D90809#2376994, @b-sumner wrote: > Should this also be IntrConvergent? Probably yes... This is control flow after all. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D90809/new/

[PATCH] D90809: [amdgpu] Add `llvm.amdgcn.endpgm` support.

2020-11-05 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. In D90809#2377221, @hliao wrote: > In D90809#2377083, @rampitec wrote: > >> In D90809#2376994

[PATCH] D90886: [AMDGPU] Simplify amdgpu-macros.cl test. NFC.

2020-11-06 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG4fcdfc4398bd: [AMDGPU] Simplify amdgpu-macros.cl test. NFC. (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo CHANGES

[PATCH] D103663: [AMDGPU] Add gfx1013 target

2021-06-07 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4697 + if (!ST.hasGFX10_AEncoding()) { +DiagnosticInfoUnsupported BadIntrin(B.getMF().getFunction(), "intrinsic not supported on subtarget", +

[PATCH] D103663: [AMDGPU] Add gfx1013 target

2021-06-07 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll:4 +; RUN: llc -global-isel -march=amdgcn -mcpu=gfx1013 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s +; RUN: llc -global-isel -march=amdgcn -mcpu=gfx1012

[PATCH] D104804: [AMDGPU] Add gfx1035 target

2021-06-23 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104804/new/ https://reviews.llvm.org/D104804

[PATCH] D103663: [AMDGPU] Add gfx1013 target

2021-06-08 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4700 +B.getMF().getFunction().getContext().diagnose(BadIntrin); +B.buildUndef(MI.getOperand(0)); +MI.eraseFromParent(); rampitec wrote: > Just return false

[PATCH] D103663: [AMDGPU] Add gfx1013 target

2021-06-08 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4701 +B.getMF().getFunction().getContext().diagnose(BadIntrin); +B.buildUndef(MI.getOperand(0)); +MI.eraseFromParent(); You can just omit undef and erase.

[PATCH] D102306: Add gfx1034

2021-05-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D102306/new/ https://reviews.llvm.org/D102306

[PATCH] D102022: [AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32

2021-05-07 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGc714d037857f: [AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32 (authored by rampitec). Herald added a project: clang. Herald added a subscriber:

[PATCH] D103663: [AMDGPU] Add gfx1013 target

2021-06-04 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll:4 +; RUN: llc -global-isel -march=amdgcn -mcpu=gfx1013 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s +; RUN: llc -global-isel -march=amdgcn -mcpu=gfx1012

[PATCH] D103663: [AMDGPU] Add gfx1013 target

2021-06-04 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. You need to replace HasGFX10_BEncoding with HasGFX10_AEncoding in the BVH and IMAGE_MSAA_LOAD_X. You also need to update llvm.amdgcn.image.msaa.load.x.ll test to include gfx1013. Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:1106 [FeatureGFX10,

[PATCH] D81886: [AMDGPU] Add gfx1030 target

2021-06-25 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:1245 + +def HasDsSrc2Insts : Predicate<"!Subtarget->hasDsSrc2Insts()">, + AssemblerPredicate<(all_of FeatureDsSrc2Insts)>; foad wrote: > The `!` is obviously wrong in this definition,

[PATCH] D95733: [AMDGPU] Set s-memtime-inst feature from clang

2021-02-02 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG8e661d3d9c52: [AMDGPU] Set s-memtime-inst feature from clang (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo CHANGES

[PATCH] D96906: [AMDGPU] gfx90a support

2021-03-29 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec marked an inline comment as done. rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:100 bool tryFoldOMod(MachineInstr ); + bool tryFoldRegSeqence(MachineInstr ); + bool tryFoldLCSSAPhi(MachineInstr ); foad wrote:

[PATCH] D98717: [AMDGPU] Split dot2-insts feature

2021-03-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM. Thanks Jay! Comment at: llvm/lib/Target/AMDGPU/AMDGPU.td:511 +def FeatureDot7Insts : SubtargetFeature<"dot7-insts", + "HasDot7Insts", + "true",

[PATCH] D97069: [clang] SimpleMFlag helper in Options.td

2021-03-02 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG21280d35d652: [clang] SimpleMFlag helper in Options.td (authored by rampitec). Herald added a project: clang. Herald added a subscriber:

[PATCH] D97420: [AMDGPU] require s-memtime-inst for __builtin_amdgcn_s_memtime

2021-02-25 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG502b3bfc6a71: [AMDGPU] require s-memtime-inst for __builtin_amdgcn_s_memtime (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-23 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:191-199 + MCRegister RepReg; + for (MCRegister R : *MRI->getRegClass(Reg)) { +if (!MRI->isReserved(R)) { + RepReg = R; + break; +} + } arsenm wrote: >
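
The loop quoted in this inline comment picks a representative register by scanning a register class for the first entry that is not reserved. A minimal standalone sketch of that search pattern follows. `Reg`, `firstUnreserved`, and the container types are hypothetical stand-ins, not the `MCRegister` / `MachineRegisterInfo` API used in the patch.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for MCRegister: a plain register number.
using Reg = std::uint32_t;

// Returns the first register in RegClass that is not in Reserved, or
// std::nullopt when every register in the class is reserved.
std::optional<Reg> firstUnreserved(const std::vector<Reg> &RegClass,
                                   const std::unordered_set<Reg> &Reserved) {
  for (Reg R : RegClass) {
    if (Reserved.count(R) == 0)
      return R;
  }
  return std::nullopt;
}
```

Returning `std::optional` makes the "whole class is reserved" case explicit for the caller. The quoted loop instead leaves `RepReg` at its default-constructed value in that situation.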

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-23 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:191-199 + MCRegister RepReg; + for (MCRegister R : *MRI->getRegClass(Reg)) { +if (!MRI->isReserved(R)) { + RepReg = R; + break; +} + } rampitec wrote:

[PATCH] D97928: [AMDGPU] Restore the s_memtime instruction in gfx1030

2021-03-05 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97928/new/ https://reviews.llvm.org/D97928

[PATCH] D100072: [AMDGPU] Allow -amdgpu-unsafe-fp-atomics to ignore denorm mode

2021-04-09 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG189310a140fa: [AMDGPU] Allow -amdgpu-unsafe-fp-atomics to ignore denorm mode (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGa8d9d50762c4: [AMDGPU] gfx90a support (authored by rampitec). Herald added a project: clang. Herald added a subscriber: cfe-commits. Changed prior to commit:

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. In D96906#2570086, @tra wrote: > This is a pretty huge patch, with no details in the commit log. > > One hour between sending the patch out and landing it is not sufficient for > anyone to meaningfully > review the patch and

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-20 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:191-199 + MCRegister RepReg; + for (MCRegister R : *MRI->getRegClass(Reg)) { +if (!MRI->isReserved(R)) { + RepReg = R; + break; +} + } rampitec wrote:

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-20 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:191-199 + MCRegister RepReg; + for (MCRegister R : *MRI->getRegClass(Reg)) { +if (!MRI->isReserved(R)) { + RepReg = R; + break; +} + } arsenm wrote: >

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-20 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:191-199 + MCRegister RepReg; + for (MCRegister R : *MRI->getRegClass(Reg)) { +if (!MRI->isReserved(R)) { + RepReg = R; + break; +} + } arsenm wrote: >

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-19 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIFormMemoryClauses.cpp:191-199 + MCRegister RepReg; + for (MCRegister R : *MRI->getRegClass(Reg)) { +if (!MRI->isReserved(R)) { + RepReg = R; + break; +} + } arsenm wrote: >

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-19 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/include/clang/Driver/Options.td:3097-3101 +def mtgsplit : Flag<["-"], "mtgsplit">, Group, + HelpText<"Enable threadgroup split execution mode (AMDGPU only)">; +def mno_tgsplit : Flag<["-"], "mno-tgsplit">, Group, +

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-19 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/include/clang/Driver/Options.td:3097-3101 +def mtgsplit : Flag<["-"], "mtgsplit">, Group, + HelpText<"Enable threadgroup split execution mode (AMDGPU only)">; +def mno_tgsplit : Flag<["-"], "mno-tgsplit">, Group, +

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-19 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/include/clang/Driver/Options.td:3097-3101 +def mtgsplit : Flag<["-"], "mtgsplit">, Group, + HelpText<"Enable threadgroup split execution mode (AMDGPU only)">; +def mno_tgsplit : Flag<["-"], "mno-tgsplit">, Group, +

[PATCH] D108150: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12194 - return (fpModeMatchesGlobalFPAtomicMode(RMW) || - RMW->getFunction() - ->getFnAttribute("amdgpu-unsafe-fp-atomics") -

[PATCH] D108150: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-17 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenOpenCL/atomics-remarks-gfx90a.cl:9 +// RUN: %clang_cc1 -cl-std=CL2.0 -O0 -triple=amdgcn-amd-amdhsa -target-cpu gfx90a \ +// RUN: -Rpass=si-lower -munsafe-fp-atomics %s -S -o - 2>&1 | \ You are

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-15 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenOpenCL/atomics-remarks-gfx90a.cl:32 +// GFX90A-CAS: atomicrmw fadd float addrspace(1)* {{.*}} syncscope("workgroup-one-as") monotonic +float atomic_cas_system(__global atomic_float *d, float a) { + return

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-15 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenOpenCL/atomics-remarks-gfx90a.cl:33 +float atomic_cas(__global atomic_float *d, float a) { + return __opencl_atomic_fetch_add(d, a, memory_order_relaxed, memory_scope_work_group); +} Just combine

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-15 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106891/new/ https://reviews.llvm.org/D106891

[PATCH] D108150: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-17 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenOpenCL/atomics-remarks-gfx90a.cl:9 +// RUN: %clang_cc1 -cl-std=CL2.0 -O0 -triple=amdgcn-amd-amdhsa -target-cpu gfx90a \ +// RUN: -Rpass=si-lower -munsafe-fp-atomics %s -S -o - 2>&1 | \

[PATCH] D106891: [AMDGPU] [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/CodeGen/AtomicExpandPass.cpp:618 expandAtomicRMWToCmpXchg(AI, createCmpXchgInstFun); + Ctx.getSyncScopeNames(SSNs); + auto MemScope = SSNs[AI->getSyncScopeID()].empty() Only if SSNs.empty().

[PATCH] D106891: [AMDGPU] [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenCUDA/fp-atomics-optremarks.cu:10 + +// GFX90A-CAS: A compare and swap loop was generated for an atomic operation at system memory scope +// GFX90A-CAS-LABEL: _Z14atomic_add_casPf gandhi21299 wrote: >

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. Please restore opencl test. Comment at: llvm/lib/CodeGen/AtomicExpandPass.cpp:622 +return OptimizationRemark(DEBUG_TYPE, "Passed", AI->getFunction()) + << "A compare and swap loop was generated for an " + <<

[PATCH] D106891: [AMDGPU] [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenCUDA/fp-atomics-optremarks.cu:10 + +// GFX90A-CAS: A compare and swap loop was generated for an atomic operation at system memory scope +// GFX90A-CAS-LABEL: _Z14atomic_add_casPf gandhi21299 wrote: >

[PATCH] D106891: [AMDGPU] [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. Please retitle it without AMDGPU and remove the changes to pass ORE to targets. It is not a part of this change; it is a part of the follow-up target-specific change. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D106891: [AMDGPU] [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-13 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: clang/test/CodeGenCUDA/fp-atomics-optremarks.cu:10 + +// GFX90A-CAS: A compare and swap loop was generated for an atomic operation at system memory scope +// GFX90A-CAS-LABEL: _Z14atomic_add_casPf gandhi21299 wrote: >

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/CodeGen/AtomicExpandPass.cpp:175 + ORE = std::make_unique(); auto = TPC->getTM(); gandhi21299 wrote: > rampitec wrote: > > Is there a reason to construct it upfront and not just use a local variable >

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/CodeGen/AtomicExpandPass.cpp:175 + ORE = std::make_unique(); auto = TPC->getTM(); Is there a reason to construct it upfront and not just use a local variable only when needed? Like in

[PATCH] D106891: [Remarks] Emit optimization remarks for atomics generating CAS loop

2021-08-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec accepted this revision. rampitec added a comment. LGTM, but please wait for others too. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106891/new/ https://reviews.llvm.org/D106891 ___

[PATCH] D108150: [Remarks] Emit optimization remarks for atomics generating hardware instructions

2021-08-16 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. - Add [AMDGPU] to the title. - Rebase on top of D106891. - Add tests to atomics-remarks-gfx90a.ll as well, including LDS with matching and non-matching rounding mode. Comment at:

[PATCH] D108150: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added inline comments. Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:12195 + if (!fpModeMatchesGlobalFPAtomicMode(RMW)) +return reportUnsafeHWInst(RMW, AtomicExpansionKind::None); gandhi21299 wrote: > rampitec wrote: > > rampitec

[PATCH] D108150: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec requested changes to this revision. rampitec added a comment. This revision now requires changes to proceed. Logic is still wrong. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108150/new/ https://reviews.llvm.org/D108150

[PATCH] D108150: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. In D108150#2950458, @gandhi21299 wrote: > @rampitec Which part of the logic is wrong? Still the same around LDS. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108150/new/

[PATCH] D108150: [Remarks] [AMDGPU] Emit optimization remarks for atomics generating hardware instructions

2021-08-18 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec added a comment. In D108150#2950479, @gandhi21299 wrote: > My understanding is that since we are reporting unsafe expansion into hw > instructions, `fpModeMatchesGlobalFPAtomicMode(RMW)` must be false to match > the logic. Please run

[PATCH] D106909: [clang] Add clang builtins support for gfx90a

2021-08-03 Thread Stanislav Mekhanoshin via Phabricator via cfe-commits
rampitec requested changes to this revision. rampitec added a comment. This revision now requires changes to proceed. Needs an IR test, a test for different supported targets, and a negative test for unsupported features. Comment at:
