[PATCH] D159118: [libc] Implement the 'clock()' function on the GPU

2023-08-29 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think something needs to notice when clock_t is a different thing on the host and on the GPU for the offloading languages and complain, probably fatally. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D159118/new/

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Name candidates - expand - lower - desugar - transform Lowering probably makes the most sense for the abi level apply to all functions, I like desugar to cover rewriting a subset of the graph Comment at: libc/config/gpu/entrypoints.txt:84-85

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield removed subscribers: kristof.beyls, wangpc, jdoerfert. JonChesterfield added inline comments. Comment at: llvm/lib/CodeGen/ExpandVAIntrinsics.cpp:215-217 +auto alloced = Builder.Insert( +new AllocaInst(VarargsTy, DL.getAllocaAddrSpace(), nullptr, +

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 551616. JonChesterfield added a comment. - Rename ExpandVAIntrinsics to DesugarVariadics Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158246/new/ https://reviews.llvm.org/D158246 Files:

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. @arsenm suggested va_start/va_end should be addrspace aware, similar idea to https://reviews.llvm.org/D15. Should be less invasive for the va intrinsics as they're not used so widely. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: llvm/lib/CodeGen/ExpandVAIntrinsics.cpp:159 +// conventions. Escape analysis on va_list values. +return false; + } Cut this from the WIP patch as the draft is noisy. Ran some tests through x64 with a

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. The lowering pass is broadly right, missing a few edge cases. Comment at: libc/config/gpu/entrypoints.txt:88 +libc.src.stdio.vsnprintf libc.src.stdio.puts libc.src.stdio.fopen ^these try to build, but fail. I

[PATCH] D158246: [amdgpu] WIP variadics

2023-08-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. Herald added subscribers: libc-commits, foad, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl, arsenm. Herald added projects: libc-project, All. JonChesterfield requested review of this revision. Herald added subscribers: llvm-commits,

[PATCH] D157738: [OpenMP] Emit offloading entries for indirect target variables

2023-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Ok, I'm really sure this needs to be reflected in the type system. for some target foo gets to be larger than 8 bytes and we use a different calling convention for it. Otherwise however we carve this the type erasure is going to make unrelated calls acquire

[PATCH] D157738: [OpenMP] Emit offloading entries for indirect target variables

2023-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. If calling an indirect function pointer on the GPU requires a table lookup (keyed by host function addresses, which I didn't think we knew at GPU compile time), and we cannot distinguish indirect function pointers from function pointers, then this feature must

[PATCH] D157738: [OpenMP] Emit offloading entries for indirect target variables

2023-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. > calling device functions via their associated host pointer What does this mean? Defining a function foo such that the host and each individual target each have their own machine code for it, such that can be copied over to the target and then invoked to mean

[PATCH] D155986: [clang][AMDGPU]: Don't use byval for struct arguments in function ABI

2023-08-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Thanks! Happy to see function calls getting cheaper Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D155986/new/ https://reviews.llvm.org/D155986 ___ cfe-commits mailing

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D156928#4562412 , @yaxunl wrote: > Some control variables are per-module. Clang cannot emit control variables > that have different values for different modules. Intrinsics should work > since it can take an argument

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Or, the front end could define those objects directly, without importing IR files that define the objects with the content clang used to choose the object file. E.g. instead of the argument daz=off (spelled differently) finding a file called daz.off.ll that

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D156928#4561849 , @arsenm wrote: > In D156928#4561811 , > @JonChesterfield wrote: > >> What does code objects version= none mean? > > Handle any version So... That should be

[PATCH] D156928: [Clang][AMDGPU] Fix handling of -mcode-object-version=none arg

2023-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. What does code objects version= none mean? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D156928/new/ https://reviews.llvm.org/D156928 ___ cfe-commits mailing list

[PATCH] D155191: clang/HIP: Directly use f32 exp and log builtins

2023-07-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D155191/new/ https://reviews.llvm.org/D155191 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D142569: [OpenMP] Introduce kernel environment

2023-07-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. How does a function find the corresponding kernel environment at runtime? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142569/new/ https://reviews.llvm.org/D142569 ___

[PATCH] D154850: [libc] Remove GPU string functions incompatible with C++

2023-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Is this a const thing? If so I'd naively hope we can declare both in C++ mode and alias them, but I can believe that is detectably broken. How does glibc manage their hack?

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. The test detection is an awkward compromise between people who want to run the GPU tests and people who don't, and reflects diverse hardware in use and variation on whether cuda / hsa are installed. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D153725#4484966 , @arsenm wrote: > In D153725#4484754 , > @JonChesterfield wrote: > >> - if you open the driver too many times at once it fails to open, so running >> a

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D153725#4484747 , @arsenm wrote: > In D153725#4484711 , > @JonChesterfield wrote: > >> The right thing to do on Linux for this is to query the driver directly. >> That is,

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/tools/amdgpu-arch/AMDGPUArch.cpp:50 +#else + return printGPUsByHSA(); +#endif yaxunl wrote: > jhuber6 wrote: > > arsenm wrote: > > > The HIP path should work on linux too. I generally think we should

[PATCH] D153725: [clang] Make amdgpu-arch tool work on Windows

2023-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. The right thing to do on Linux for this is to query the driver directly. That is, the kernel should populate some string under /sys that we read. That isn't yet implemented. Does windows happen to have that functionality available? (landed here while trying to

[PATCH] D154036: [libc] Add support for creating wrapper headers for offloading in clang

2023-07-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. OK, let's go with this. It's a fairly alarming mess localised quite closely to the language that requires the complexity, minimal damage to libc itself. Repository: rG

[PATCH] D153909: [AMDGPU] Always pass `-mcpu` to the `lld` linker

2023-06-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I thought lld did the right thing if you pass it raw IR, without specifying an arch, but on reflection it might just be silently miscompiling things. The test doesn't seem to involve a specific type of input file, does clang foo.ll -o a.out pass a mcpu flag

[PATCH] D153568: [ClangPackager] Add an option to extract inputs to an archive

2023-06-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. LGTM. As a meta level comment, we have far too many of these binary munging sorts of tools. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D153578: [Clang] Disable `libc` headers for offloading languages

2023-06-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks, unblocked :) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153578/new/ https://reviews.llvm.org/D153578

[PATCH] D152867: OpenMP: Add a new test for constantexpr evaluation of math headers

2023-06-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D152867/new/ https://reviews.llvm.org/D152867 ___ cfe-commits

[PATCH] D152829: clang: Add start of header test for __clang_hip_libdevice_declares

2023-06-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. That seems like something we should fix. Visibility/dso really should be the same default. I don't know what that fp thing is CHANGES SINCE LAST ACTION https://reviews.llvm.org/D152829/new/ https://reviews.llvm.org/D152829

[PATCH] D152442: [LinkerWrapper] Support linking vendor bitcode late

2023-06-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. This LG, the complicated stuff is in the previous patch. Thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D152442/new/

[PATCH] D152391: [Clang] Allow bitcode linking when the input is LLVM-IR

2023-06-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/CodeGen/CGCall.cpp:2001 -void CodeGenModule::getDefaultFunctionAttributes(StringRef Name, - bool HasOptnone, - bool

[PATCH] D138504: clang/HIP: Remove __llvm_amdgcn_* wrapper hacks

2023-06-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Yep, thanks. Anything under __llvm is fair game to change and there's a find& replace fix for anything that decided to call these directly. CHANGES SINCE LAST ACTION

[PATCH] D152391: [Clang] Allow bitcode linking when the input is LLVM-IR

2023-06-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. That's cleaner than I expected, thanks. Might be reasonable to factor out the method as an initial NFC then insert the call to it along with the new test case as the functional change. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D102107: [OpenMP] Codegen aggregate for outlined function captures

2023-06-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm trying to pick up the context for this and D95976 . Superficially it looks like lowering variadic functions in the compiler could be used to simplify quite a lot of this, @jdoerfert there's a comment from some time ago which

[PATCH] D151087: [Clang] Permit address space casts with 'reinterpret_cast' in C++

2023-05-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I like the current behaviour of bypass semantic checking and emit IR with the same number on it as my primary use case for feeding freestanding C++ to clang is as a convenient way to emit specific IR and I'm not that bothered about clang detecting mistakes

[PATCH] D149028: [Clang] Always pass `-fconvergent-functions` for GPU targets

2023-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think this is sensible. Passing fno-convergent-functions presumably changes the default? I wonder if we should adopt this and then remove the checks for each of the GPU programming models Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D149019: [Clang] Accept and forward `-fconvergent-functions` in the driver

2023-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D149019/new/ https://reviews.llvm.org/D149019

[PATCH] D149019: [Clang] Accept and forward `-fconvergent-functions` in the driver

2023-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Can't reasonably see the semantic change between all the whitespace reformat, please split those two. E.g. use git-clang-format to only fix formatting in the part you're changing, or commit a clang-format only change first and then rebase this. Repository:

[PATCH] D147666: [OPENMP] Adds /lib to rpath to avoid need to set LD_LIBRARY_PATH to find plugins.

2023-04-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Sorry Greg, we can't have this back in trunk. It's now value add for the ROCm toolchain, and some llvm releases had this feature, but no longer. The consensus in the OpenMP weekly meeting was to require users to hack around with linker flags or environment

[PATCH] D146973: [Clang] Implicitly include LLVM libc headers for the GPU

2023-03-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Can we default to freestanding on, or just document that freestanding is a good idea, instead of hacking with include behaviour directly? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D146973/new/

[PATCH] D145941: [Clang] Always use -zdefs when linking AMDGPU images

2023-03-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: AMDGPU. JonChesterfield added a comment. Adding the amdgpu reviewer group. This change might be contentious - we currently treat various broken code as calls to trap on the grounds that those functions might not be executed. By analogy, functions that call

[PATCH] D118493: Set rpath on openmp executables

2023-03-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. That is great news, phabricator's list of branches has mislead me. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118493/new/ https://reviews.llvm.org/D118493 ___

[PATCH] D118493: Set rpath on openmp executables

2023-03-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Duplicating a comment from the commit thread so it's easier for me to find later. You've applied this to the release branches going back as far as 14. It's a user facing breaking change. As in people who have a working openmp toolchain and update to the point

[PATCH] D118493: Set rpath on openmp executables

2023-03-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Reporting after another round of discussion. I don't have much support from the llvm openmp working group that we should maintain the current divergence from libc++ and the like so I'm backing down. Let's revert this and regress for all users who don't install

[PATCH] D142174: [OpenMP] Don't set rpath for system paths

2023-03-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I'm happy with this but agree that "what might be a system path?" is a tricky heuristic. What we want is to exclude the places that the application will search anyway, but

[PATCH] D143306: [Driver] Default to -fno-openmp-implicit-rpath

2023-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. If fedora has a specific set of paths it rejects in DT_RUNPATH - and libomptarget is on one of those by default - and user executables find it there - and we can detect we're building on fedora then we can disable the rpath for fedora installs to system root,

[PATCH] D144505: [Clang] Add options in LTO mode when cross compiling for AMDGPU

2023-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. NVM the above, all the other call sites to addLTOOptions have that test in them so it must be fine Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D144505: [Clang] Add options in LTO mode when cross compiling for AMDGPU

2023-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:545 +addLTOOptions(getToolChain(), Args, CmdArgs, Output, Inputs[0], + C.getDriver().getLTOMode() == LTOK_Thin); CmdArgs.push_back("-shared");

[PATCH] D143306: [Driver] Default to -fno-openmp-implicit-rpath

2023-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield requested changes to this revision. JonChesterfield added a comment. This revision now requires changes to proceed. Marking this as "no" because as far as I can tell it'll stop anyone running openmp built from source which constitutes a severe regression and I am struggling to

[PATCH] D143306: [Driver] Default to -fno-openmp-implicit-rpath

2023-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/include/clang/Driver/Options.td:4218-4223 defm openmp_implicit_rpath: BoolFOption<"openmp-implicit-rpath", LangOpts<"OpenMP">, - DefaultTrue, + DefaultFalse, PosFlag, NegFlag, BothFlags<[NoArgumentUnused]>>;

[PATCH] D143306: [Driver] Default to -fno-openmp-implicit-rpath

2023-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. So what is this configuration file? Joseph found a Gentoo blog post https://blogs.gentoo.org/mgorny/2022/10/07/clang-in-gentoo-now-sets-default-runtimes-via-config-file/ and I don't have a clang.cfg file in my install dir Repository: rG LLVM Github Monorepo

[PATCH] D143306: [Driver] Default to -fno-openmp-implicit-rpath

2023-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I don't know how this configuration file works. Running clang from the build directory is useful when developing, but to avoid user facing regression we'd need it to run from the install directory. The use case is someone builds trunk or a release on a HPC

[PATCH] D143306: [Driver] Default to -fno-openmp-implicit-rpath

2023-02-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I don't mind hugely what mechanism is used but would really like clang++ -fopenmp foo.cpp to build a program that runs. How can we preserve that 'works' feature without setting rpath on the binary? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D143306: [Driver] Default to -fno-openmp-implicit-rpath

2023-02-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added reviewers: tianshilei1992, ye-luo, RaviNarayanaswamy. JonChesterfield added a comment. Added some people I can remember from the original discussion. The effect of this patch will be that clang -fopenmp foo.cpp will build an executable that doesn't know where its runtime

[PATCH] D118493: Set rpath on openmp executables

2023-01-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. That works if you have one toolchain installed globally or you are happy to cobble together a working system using environment variables. If you have multiple toolchains, they can't all sit in the global directory. If you don't have root, you can't install them

[PATCH] D142138: [OpenMP] Clean up AMD handling for `-fopenmp-targets=amdgcn` arch inference

2023-01-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a subscriber: arsenm. JonChesterfield added a comment. @arsenm as above, mcpu != march important? llc takes a different one to clang iirc Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142138/new/

[PATCH] D142133: [LinkerWrapper] Use `clang` to perform the device linking

2023-01-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Herald added a subscriber: ormris. That's much cleaner, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D142133/new/

[PATCH] D141859: [amdgpu-arch] Dynamically load the HSA runtime if not found during the build

2023-01-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Yes, ok. Not thrilled about the copy from openmp but we can fix that as soon as we agree on a subdir to put code shared between llvm/clang/openmp and that has been tricky to

[PATCH] D141440: [OpenMP] Adjust phases for AMDGPU offloading for OpenMP in save-temps mode

2023-01-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. If i'm reading this right, the change means we emit the same save-temps files as hip with the same naming convention. That sounds great, makes life easier for backend devs

[PATCH] D141105: [OpenMP] Add support for '--offload-arch=native' to OpenMP offloading

2023-01-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Possible naming hazard here. march=native means target the local processor architecture, zen2 or whatever, and we have the host CPU as an offloading target already. So what I'd expect this to do is host offloading with the openmp runtime compiled for the local

[PATCH] D140226: [NVPTX] Introduce attribute to mark kernels without a language mode

2022-12-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Calling convention is the right model here. Kernels are functions with a different calling convention to the 'normal' functions in a very literal sense. The calling convention modelling in clang is different to attribute handling and changing nvptx to it is

[PATCH] D140158: [CUDA] Allow targeting NVPTX directly without a host toolchain

2022-12-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. If we do magic header including, we should check for the freestanding argument and not include them with that set. I would prefer we not include cuda headers into C++ source that isn't being compiled as cuda, and also not link in misc cuda library files.

[PATCH] D137875: clang/AMDGPU: Drop volatile from builtin_atomic* pointers

2022-12-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D137875/new/ https://reviews.llvm.org/D137875 ___ cfe-commits

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I am reluctant to add the dependency edge to rocm device libs to openmp's GPU runtime. We currently require that library for libm, which I'm also not thrilled about, but at least you can currently build and run openmp programs (that don't use libm, like much

[PATCH] D138141: [amdgpu] Reimplement LDS lowering

2022-12-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 480484. JonChesterfield added a comment. Herald added subscribers: cfe-commits, libc-commits, libcxx-commits, openmp-commits, lldb-commits, Sanitizers, hanchung, kadircet, jsetoain, Moerafaat, zero9178, pcwang-thead, anlunx, steakhal, mtrofin, Enna1,

[PATCH] D138392: clang/HIP: Fix broken implementations of __make_mantissa* functions

2022-11-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Patch is obviously fine. Given these are internal functions, perhaps we should annotate them with the non-null attribute and delete the early exit. CHANGES SINCE LAST

[PATCH] D138862: [Clang] Do not set offload kind in a freestanding build

2022-11-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Does this change the annotation on kernels compiled from C or C++ with -ffreestanding? If so, probably want a test showing the change. I have a vague idea that they pick up a default of 'opencl' at present, where 'none' is probably better. Or perhaps this is

[PATCH] D138391: clang/HIP: Add new header test for math IR gen

2022-11-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Good call adding this test, thanks. I think some of the skipping is glibc quirk related and some was to avoid clobbering the user's namespace for openmp. The math header

[PATCH] D138473: clang/HIP: Inline frexp/frexpf implementations

2022-11-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Noisy diff! Getting rid of the openmp special case handling here is nice. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138473/new/ https://reviews.llvm.org/D138473

[PATCH] D137524: clang/AMDGPU: Emit atomicrmw for atomic_inc/dec builtins

2022-11-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I don't recognize atomicrmw udec_wrap and can't find it in https://llvm.org/docs/LangRef.html#atomicrmw-instruction. I do vaguely recall the semantics of these builtins (well, the instructions they target) being surprising, Do you know where the uinc_wrao etc

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D130096#3816149 , @arsenm wrote: > I'd prefer to avoid spreading special treatment of the device libraries into > the backend. The contract is poorly defined and spread around too much as it > is

[PATCH] D135076: [Clang] Make offloading flags accept '-' and '--'

2022-10-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Like that a lot, good quality of life improvement. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135076/new/

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2022-09-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D82087#3803464 , @arsenm wrote: > In D82087#3797883 , @jdoerfert wrote: > >> Can we land this? I'd like to use the new intrinsics as I don't understand >> the old ones. > > What

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I don't like this but will concede it's quicker than changing device libs to contain IR that doesn't have to be patched on the fly. Repository: rG LLVM Github Monorepo

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. march for openmp, mcpu for hip seems ok. Notably llc needs to be told this as well, using mcpu, which may be an issue for save-temps Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133726/new/

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.h:58 + llvm::SmallVector + getHIPDeviceLibs(const llvm::opt::ArgList ) const override; + jhuber6 wrote: > JonChesterfield wrote: > > Why hip device libs? There's a

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.h:58 + llvm::SmallVector + getHIPDeviceLibs(const llvm::opt::ArgList ) const override; + Why hip device libs? There's a common set, plus a hip.bc plus a opencl.bc. I'd

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. We can do this but should expect an increase in code size from having multiple internalised copies of the same function. There may be an incidental benefit if we can specialise some functions to the call site without additional cloning. Address of the same

[PATCH] D131639: [OpenMP] Remove 'stdbool.h' from OpenMP header wrappers

2022-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. We should write down (maybe in the commit message) what this is fixing. The linked bug has someone defining bool as a workaround for msvc which shouldn't be applied when not compiling with msvc, so superficially it looks like this patch works around broken

[PATCH] D131307: [Clang] Allow downgrading to a warning the diagnostic for setting a non fixed enum to a value outside the range of the enumeration values

2022-08-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Did some digging here. The function hsa_agent_get_info takes an argument of type hsa_agent_info_t, which has declared values in the range [0 24]. The implementation of that (in amd_gpu_agent fwiw) casts that argument to a size_t and then switches on it,

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. A safer bet is to use the current control flow that links in specific bitcode files, but create the global directly instead of linking in the file. That'll give us zero semantic change and a clang that ignores those bitcode files if present. Repository: rG

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: b-sumner. JonChesterfield added a comment. Tagging Brian as the code owner of rocm device libs - emitting these in clang would simplify that library. Currently clang reads these commandline flags and conditionally links in bitcode files to introduce these

[PATCH] D129534: [OpenMP] Do not link static library with `-nogpulib`

2022-07-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129534/new/ https://reviews.llvm.org/D129534

[PATCH] D128914: [HIP] Add support for handling HIP in the linker wrapper

2022-07-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Code looks good to me. It's hard to be sure whether it works without running a bunch of hip test cases through it, have you already done so? If it doesn't work out of the

[PATCH] D129435: [Clang] Parse toolchain-specific offloading arguments directly

2022-07-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. LG, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129435/new/ https://reviews.llvm.org/D129435

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I think this is probably OK. Smaller patches usually get reviewed faster so minimising the line noise in the browser is worthwhile. Comment at:

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I've read it but can't promise it's correct - the diff is large and has some spurious noise in it which distracts significantly from the functional changes. Would you be willing to split this into two patches, one which renames variables and moves blocks of

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D125970#3531645 , @aaron.ballman wrote: > In D125970#3527685 , > @JonChesterfield wrote: > >> If it was adding a calling convention, sure - caution warranted. There's no >>

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. If it was adding a calling convention, sure - caution warranted. There's no llvm change here though, an existing CC is exposed to C++. No change to the type system either. I'll propose a patch with some documentation for it if you wish, but it'll just say "For

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. A clangd buildbot (https://lab.llvm.org/buildbot/#/builders/131/builds/27770) failed on this with [ RUN ] SerializationTest.NoCrashOnBadArraySize ==384111==ERROR: ThreadSanitizer failed to allocate 0x1 (65536) bytes of stack depot (error code: 12)

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-20 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG83c431fb9e72: [amdgpu] Add amdgpu_kernel calling conv attribute to clang (authored by JonChesterfield). Repository: rG LLVM Github Monorepo

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Unintentionally created this patch against an older version of main and it interacted badly with D124998 on the rebase. Rerunning tests now, and will leave this open for further comments for a little while. Thanks all

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 430821. JonChesterfield added a comment. - Fix git merge misfires Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D125970/new/ https://reviews.llvm.org/D125970 Files: clang/include/clang/Basic/Attr.td

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 430820. JonChesterfield added a comment. - Fix git merge misfires Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D125970/new/ https://reviews.llvm.org/D125970 Files: clang/include/clang/Basic/Attr.td

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 430818. JonChesterfield added a comment. - Rebase on main Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D125970/new/ https://reviews.llvm.org/D125970 Files: clang/include/clang/Basic/Attr.td

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Thanks for accepting! I'm interested to learn more about how the calling conv works, e.g. if parts of it are implemented in clang and parts of it patched on the fly by opt, but that's downstream of easy access to writing C tests that use it. Repository: rG

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Added a codegen test for arg passing. It establishes that most arguments are left alone, but structs passed by value are handled as an addrspace(4) byref. Letting opt -O2 run annotated some argument pointers as being in addrspace(1) which I think is wrong. I

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 430801. JonChesterfield added a comment. - Add O0 arg passing codegen test Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D125970/new/ https://reviews.llvm.org/D125970 Files:

[PATCH] D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang

2022-05-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. OK, so that's a different thing. CUDA/HIP has a bunch of rules about implicitly tagging things with addrspace(1) at the call boundary. I don't think any of that magic should exist for C or C++, the developer gets to spell out the address space stuff they want

  1   2   3   4   5   6   7   8   9   >